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Abstract 

Receiver  Operating  Characteristic  (ROC)  curves  are  used  to  compare  the  ef¬ 
fectiveness  of  IR  image  processing  techniques.  Two  non-parametric  error  estimation 
techniques  (k-Nearest  Neighbor  and  Parzen  Window)  are  used  to  create  estimates 
of  the  probability  density  functions  for  the  data.  These  pdfs  are  used  in  the  cre¬ 
ation  of  the  ROC  curves  for  both  resubstitution  and  leave-one-out  estimates.  These 
estimates  generate  the  upper  and  lower  bounds,  respectively,  on  the  ROC  curves. 
The  ROC  curve  analysis  is  performed  on  the  outputs  of  various  image  processing 
techniques  and  the  resulting  ROC  curves  are  used  to  compare  the  techniques.  Of 
the  image  processing  techniques  used  in  this  thesis,  the  close  minus  open  (CMO) 
morphological  filter  operation  produced  the  best  results. 
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ROC  Analysis  of  IR  Segmentation  Techniques 


L  Introduction 

1.1  Background 

Image  processing,  segmentation,  and  target  recognition  are  important  areas  of 
research  to  the  Air  Force.  The  object  of  image  processing,  segmentation,  and  target 
recognition  is  to  automatically  identify  regions  of  the  image  that  contain  targets, 
without  requiring  human  intervention.  This  will  allow  the  pilot  of  an  aircraft  to 
concentrate  on  other  things  such  as  maneuvering  through  enemy  fire.  Automatic 
segmentation  will  also  identify  targets  that  may  be  difficult  for  the  pilot  to  see 
because  they  are  either  too  far  away  or  camoflauged.  Typical  infrared  images  are 
shown  in  Figures  1.1  and  1.2. 

There  are  many  image  processing  techniques  available,  such  as  spatial  feature 
extraction,  morphological  filters,  and  hit-miss  transforms,  but  none  is  perfect  in 
every  situation.  These  techniques  are  based  upon  probabilities  and  statistics  and, 
therefore,  have  errors  associated  with  them.  Because  there  are  errors,  a  method  for 
comparing  the  efficiency  of  each  technique  must  be  found.  One  way  to  do  this  is 
through  Receiver  Operating  Characteristic  (ROC)  curves. 

ROC  analysis  was  based  on  statistical  decision  theory  and  developed  in  signal 
detection  theory  [13,  19].  In  the  past,  the  performance  of  classification  systems  was 
measured  by  the  percentage  of  correct  decisions,  but  that  “percent  correct”  does  not 
account  for  the  false-positive  and  false-negative  errors  involved  [13].  For  example, 
if  5%  of  people  have  a  particular  disease,  then  a  system  can  be  95%  accurate  by 
calling  everyone  negative.  The  problem  with  this  number  is  that  it  does  not  take  into 
account  that  the  system  fails  to  find  any  positives.  ROC  analysis  solves  this  problem 
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Figure  1.1  Typical  Infrared  Image  (range=5775  ft). 


Figure  1.2  Typical  Infrared  Image  (range=4100  ft). 
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by  plotting  true  positive  fractions  (TPF)  against  false  positive  fractions  (FPF)  [13]. 
In  the  area  of  target  recognition  and  in  this  thesis,  these  are  the  probabilities  of  hit 
and  false  alarm,  respectively. 

In  order  to  create  the  ROC  curves  from  finite  sample  sets,  the  probability  den¬ 
sity  functions  need  to  be  estimated.  Two  common  non-parametric  density  estimation 
techniques  are  the  k-Nearest  Neighbor  estimation  proposed  by  Fix  and  Hodges,  and 
the  Parzen  Window  approach  proposed  by  Parzen  [6,  15].  These  non-parametric 
techniques  for  estimating  the  pdfs  are  used  because  they  do  not  assume  a  functional 
form  for  the  pdf. 

Resubstitution  and  leave-one-out  methods  are  used  to  create  the  upper  and 
lower  bounds,  respectively,  on  the  ROC  curves.  The  resubstitution  method  involves 
using  all  available  samples  to  train  the  classifier,  and  then  testing  with  those  same 
samples.  Lachenbruch  proposed  the  leave-one-out  method  which  involves  using  all 
but  one  sample  to  train  the  classifier,  testing  with  that  one  sample,  and  repeating 
the  process  for  all  data  samples  [10].  Fukunaga  and  Hummels  showed  that  the 
resubstitution  and  leave-one-out  techniques  can  be  applied  to  k-NN  and  Parzen 
classifiers  to  obtain  upper  and  lower  bounds  on  the  Bayes  error  rate  [7].  Their 
technique  is  used  in  this  research  to  find  the  density  estimations  from  which  the 
ROC  curves  are  created. 

1.2  Problem  Statement 

This  thesis  focuses  on  estimating  the  ROC  curves  using  various  segmentation 
techniques  to  segment  infrared  images,  and  then  processing  the  segmented  images 
to  classify  the  pixels  into  two  classes. 

1.3  Scope 

Actual  infrared  imagery  taken  by  aircraft  flying  over  targets  in  a  desert  back¬ 
ground  is  used  in  producing  the  ROC  curves.  This  imagery  was  obtained  from 
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the  Wright  Laboratories  Automatic  Target  Recognition  Branch  at  Wright-Patterson 
AFB.  A  completely  different  set  of  imagery  was  obtained  for  the  University  of  Cincin¬ 
nati.  This  imagery  contained  x-ray  images  of  breast  cancer  tumors,  both  malignant 
and  benign.  Although  the  ROC  curves  generated  in  this  thesis  apply  specifically  to 
the  images  used,  the  software  can  be  used  to  obtain  ROC  curves  for  a  variety  of 
targets  in  various  backgrounds.  This  thesis  concentrates  on  using  only  k-NN  and 
Parzen  Window  techniques  to  produce  probability  estimates  for  a  two-class  problem: 
target  and  non-target. 

1.4  Approach 

The  approach  taken  consists  of  first  implementing  current  image  processing 
techniques:  morphology  and  spatial  feature  extraction.  Then  Parzen  Window  and 
k-Nearest  Neighbor  techniques,  described  by  Fukunaga  and  Hummels  and  validated 
by  Martin,  are  used  to  estimate  probability  density  functions  from  the  images  [7,  11]. 
From  the  probabilities  obtained,  ROC  curves  are  constructed. 

1.5  Research  Objectives 

The  research  objectives  for  this  thesis  are  as  follows: 

1.  Implement  image  processing  techniques:  morphology  and  spatial  feature  ex¬ 
traction  [3,  8]. 

2.  Implement  Parzen  and  k-NN  error  estimation  approaches  to  construct  ROC 
curves  for  each  processing  technique  [7,  11,  12]. 

3.  Show  that  the  ROC  curve  analysis  is  consistent  from  one  image  to  the  next. 

4.  Develop  an  understanding  of  how  each  image  processing  and  error  estimation 
technique  works,  as  well  as  how  to  interpret  the  results. 
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1.6  Thesis  Organization 

The  rest  of  this  thesis  is  organized  as  follows;  Chapter  II  contains  the  theory 
behind  the  probability  and  error  estimations,  and  each  image  processing  technique. 
Chapter  III  discusses  the  methods  used  to  implement  the  processing  techniques  and 
error  estimations.  Chapter  IV  presents  the  results  obtained  from  applying  each 
technique.  Finally,  Chapter  V  gives  an  overall  summary  of  this  research. 

1.7  Summary 

Image  segmentation  and  target  recognition  are  ongoing  areas  of  research.  This 
thesis  investigates  image  processing  techniques  as  well  as  a  method  for  measuring 
their  performances  in  segmentation  applications. 


11.  Theory 


This  chapter  describes  the  theory  behind  the  methods  used  in  this  thesis.  The 
following  topics  are  discussed: 

•  Probability  theory 

•  Non-parametric  density  estimation 

•  Receiver  Operator  Characteristic  curves 

•  Morphology. 

2.1  Probability  Theory 

Probability  theory  is  a  mathematical  method  used  to  describe  random  phe¬ 
nomena  which  are  not  deterministic,  but  can  be  approximately  described  by  the 
relative  frequency  of  the  occurence  of  the  possible  outcomes  [18].  A  simple  example 
of  this  is  the  tossing  of  a  fair  coin.  Although  the  outcome  on  any  given  toss  cannot 
be  predicted,  it  is  likely  that  after  many  tosses,  approximately  half  of  the  outcomes 
will  be  heads  and  the  other  half  tails. 

The  probability  density  function  (pdf)  is  a  representation  of  the  distribution 
of  the  outcomes  of  a  random  experiment.  It  shows  the  probability  for  observing  any 
outcome  on  a  given  trial  of  the  experiment.  The  pdf  of  a  single  real  variable  has  the 
following  properties: 


f{x)  >  0  for  all  X 

(2.1) 

poo 

/  f{x)dx  =  1. 

J—oo 

(2.2) 

The  pdf  for  the  coin  toss  mentioned  above  can  be  seen  in  Figure  2.1.  Another 
example,  this  time  a  continuous  pdf,  can  be  seen  in  Figure  2.2.  In  this  case,  the 
probability  of  the  outcome  of  the  experiment  falling  between  points  a  and  b  is  the 
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shaded  area  described  by 


(2.3) 


p(x) 


Figure  2.1  The  probability  density  function  for  the  coin  toss. 


Once  pdfs  have  been  estimated,  error  probabilities  can  be  found.  When  two 
conditional  pdfs  are  plotted  on  the  same  axis  as  in  Figure  2.3,  a  decision  threshold 
(t)  can  be  placed  at  any  point  on  the  axis  and  a  decision  rule  can  be  declared.  The 
decision  rule  used  here  is  the  Bayes  decision  rule  where 

P{si\x)  >  P{s2\x)  (2.4) 

S2 
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or 


P{x\si)P{si)  5  p(^k2)i^(^2) 

p{x)  p{x)  ^  ^ 

[16].  For  example,  given  the  pdfs  in  Figure  2.3,  the  decision  rule  would  be  to  choose 

Si  if  the  outcome  falls  to  the  left  of  the  decision  threshold  and  S2  if  it  falls  to  the 

right.  In  following  this  decision  rule,  the  probability  of  error  for  Si  (an  Si  point 

that  is  declared  ^2)  would  be  the  area  under  the  Si  pdf  that  lies  to  the  right  of  the 

threshold.  Similarly,  the  probability  of  error  for  S2  would  be  the  area  under  the  S2 

pdf  that  lies  to  the  left  of  the  threshold. 


f(x) 


p(xlSi)P(Si) 


p(xlS2)P(S2) 


p(errlsJ  p(errls,) 

< - - - 1 


Figure  2.3  Example  pdfs  showing  error  probabilities. 


2.2  Non-Parametric  Density  Estimation 

Non-parametric  density  estimation  is  a  way  of  estimating  the  pdf  of  a  random 
variable  given  a  finite  number  of  samples  without  assuming  a  functional  form.  In  this 
section,  the  two  estimation  techniques  used  in  this  thesis  will  be  described:  k-nearest 
neighbor  and  Parzen  window. 

2.2.1  k-Nearest  Neighbor.  The  k-nearest  neighbor  (k-NN)  technique  for 
estimating  pdfs  is  based  on  the  volume  created  by  enclosing  the  k  nearest  samples  to 
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a  data  point  by  a  contour  of  constant  distance.  If  this  volume  is  small,  that  means 
the  points  are  closely  spaced  and  the  probability  of  an  outcome  falling  in  that  region 
is  high.  Therefore,  the  pdf  is  large  at  that  point.  Conversely,  if  the  volume  is  large, 
the  points  are  spread  out  and  there  is  a  lower  probability  of  an  outcome  falling  in 
that  region.  Therefore,  the  pdf  is  small  at  that  point.  An  example  of  this  can  be  seen 
in  Figure  2.4.  (In  actuality,  the  contour  enclosing  points  along  a  one-dimensional 
line  is  only  a  length,  but  for  illustration  purposes  a  circle  is  used.) 


Figure  2.4  k-NN  density  estimation.  This  estimation  for  k=4  illustrates  how  the 
contours  enclose  the  4  nearest  points  to  each  chosen  point. 


The  k-NN  density  estimation  for  class  Si  at  point  x  is 


Pi{x) 


k-1 

JV,-  vi->{x) 


where  Ni  is  the  total  number  of  samples,  k  is  the  number  of  neighbors,  and  V^'\x) 
is  the  volume  inside  the  surface  of  constant  distance  which  encloses  those  k  nearest 
neighbors  [7,  11]. 
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To  calculate  the  volume  inside  the  surface  of  constant  distance,  a  distance 
metric  must  be  chosen.  The  one  used  in  this  thesis  is  the  squared  Mahalanobis 
distance  described  by 


y)  =  {^-  yf  IZ  ^  (x-y)  (2.7) 


where  ^  is  the  inverse  covariance  matrix  [11].  In  n-dimensional  space,  a  surface 
of  constant  distance  is  a  hyper-ellipsoid  having  a  volume 


(2.8) 

where 


rik{x)  =  ‘ 

yd?(x,4-iviv) 

(2.9) 

f  7r”/2 

Vd=  - 

1  (n/2)! 

11  even 

(2.10) 

1  2"7r(”‘ 

— ,  ^  ^  n  odd 

[4,  11].  Substituting  into  Equation  2.6,  the  k-NN  estimate  of  the  class  Si  pdf  at  point 
X  is 


Pi{x) 


_ ^-_1 _ 

El*'" 


(2.11) 


[11], 


2.2.2  Parzen  Window.  In  simple  terms,  the  Parzen  window  technique 
for  estimating  pdfs  is  obtained  by  placing  a  small  window  function  at  every  data 
point.  When  the  areas  of  all  the  window  functions  are  added  together,  the  resulting 
function  will  approximately  estimate  the  density  function.  A  simple  example  is 
shown  in  Figure  2.5. 
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Figure  2.5  Parzen  window  density  estimation. 

Mathematically,  the  Parzen  estimate  of  the  conditional  pdf  at  point  x  for  class 
Si  containing  data  in  n-dimensional  space  is 

=  (2.12) 

where  k{-)  is  a  kernel  (or  window)  function,  his  a,  parameter  that  controls  the  spread 
of  the  window,  Ni  is  the  number  of  samples,  and  Xj  are  the  samples  [7,  11]. 

The  window  used  in  this  thesis  is  a  Gaussian  with  the  following  functional 

form: 

M  =  (27^!  |^|f7^  exp[-^(x  -  ,if  X]  -  m)]  (2.13) 

where  //  is  a  mean  vector  and  X)  is  a  covariance  matrix  [4,  11].  The  equation  for  the 
window  kernel  is 


\  1 

1  -  ZiVTA 

1  ^  ) 

'  (27r)"/2|Ed 

1/2  ^ 

2h^ 
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where  the  varies  the  spread  of  the  window  [11].  Substituting  into  Equation  2.12, 
the  Parzen  density  estimate  with  the  Gaussian  window  function  is 


Piix) 


-I  N-i 

* 


1 

/i" 


(2,r)"/2E//^  h 


exp 


(2.15) 


2.S  Receiver  Operating  Characteristic  Curves 

For  an  in  depth  description  of  ROC  curves,  see  the  works  by  Egan  or  Metz 
[5,  13].  In  simple  terms  for  this  thesis,  a  ROC  curve  is  the  probability  of  a  hit  plotted 
versus  the  probability  of  false  alarm.  Given  two  overlapping  pdfs,  each  point  in  the 
shared  region  will  have  both  a  probability  of  a  hit  and  false  alarm  as  demonstrated 
in  Figure  2.6.  For  each  point,  the  pair  of  probabilities  are  plotted  together  as  in 
Figure  2.7  to  obtain  the  ROC  curve.  Creating  these  curves  for  each  image  processing 
technique  provides  a  means  to  measure  the  accuracy  of  each  technique  and  make 
comparisons  between  different  techniques. 


< — Eiit - 


Figure  2.6  Probability  of  hit  and  false  alarm  for  given  two  overlapping  condi¬ 
tional  pdfs. 
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Figure  2.7  An  example  ROC  curve.  Shows  the  resulting  ROC  curve  of  the  pdfs  in 
Figure  2.6. 

2.4  Morphology 

Mathematical  morphology  is  an  image  processing  tool  used  to  extract  image 
components  that  are  used  to  describe  shapes  in  the  image  [8].  Set  theory  is  used 
to  describe  mathematical  morphology,  and  sets  are  used  to  represent  the  shapes 
of  objects  in  the  image.  The  following  sections  describe  set  theory  and  the  basic 
functions  of  morphology. 

2.4.1  Basic  Definitions  of  Set  Theory.  In  order  to  understand  morphology, 
one  needs  to  know  some  basic  definitions  dealing  with  set  theory. 

First  of  all,  a  set  is  a  collection  of  objects  called  elements  [14].  For  this  thesis, 
these  elements  are  pixels  in  an  image,  so  a  set  is  a  collection  of  pixels. 

A  subset  is  another  set  whose  elements  are  also  elements  of  a  larger  set  [14]. 
This  is  denoted  by  T  C  if  T  is  a  subset  of  Z. 

The  union  of  two  sets  is  a  new  set  whose  elements  are  all  the  elements  of  the 
original  two  sets.  For  example,  if  set  Y  =  {1,2, 3,4, 5}  and  set  Z  =  {2,4, 6,8}  then 
the  union  is  Y  U  Z  =  {1,2,3,4, 5, 6,8}. 

The  intersection  of  two  sets  is  a  new  set  consisting  of  all  elements  that  are  com¬ 
mon  to  both  sets.  Using  sets  Y  and  Z  from  above,  the  intersection  isY  H  Z  —  {2, 4}. 
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The  following  definitions  were  taken  from  the  work  by  Gonzalez  and  Woods 
[8].  The  complement  of  set  A  consists  of  all  elements  not  in  A  and  is  defined  as 

A‘^^{x\x^A].  (2.16) 

An  example  is  shown  in  Figure  2.8. 

The  translation  (shift)  of  set  B  hy  x  =  {xi^X2)  is  defined  as 

{B)x  =  {c|c  =  6  +  x,  for  h  E  B].  (2-17) 

The  reflection  of  B  is  defined  as 

B  =  {a;|a;  =  —6,  for  b  6  B].  (2.18) 

The  difference  of  the  two  sets  A  and  B  is  defined  as 

A  —  B  =  {a;|a;  G  A,  x  ^  B}  =  An  B^^  (2.19) 

and  is  also  shown  in  Figure  2.8. 


Figure  2.8  Venn  diagrams  showing  the  complement  and  difference. 

For  the  morphology  examples  that  follow,  A  is  the  image  and  B  is  the  struc¬ 
turing  element  (kernel).  For  binary  morphology,  A  and  B  are  sets  in  the  2-D  integer 
space  with  components  a  =  (01,02)  and  b  =  (61,62). 
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Image  A 


Structuring  Element  B 


Dilation  Erosion 


Figure  2.9  Example  of  binary  morphology,  (photo-negative:  black=l,  white=0) 
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2.4- 2  Dilation.  Dilation  is  a  process  that  expands  the  image  A  by  the 
structuring  element  B.  The  process  consists  of  reflecting  B  and  then  finding  the  set 
of  all  shifts  X  where  A  and  B  overlap  by  at  least  one  nonzero  element.  This  dilation 
process  is  defined  by  the  following  equations: 

AQ)B  =  {x\{B),nA^$}  (2.20) 

A®B  =  {x\[{B),nA]CA}  (2.21) 

[8].  An  example  of  a  dilation  can  be  seen  in  Figure  2.9. 

2.4- 3  Erosion-  The  erosion  of  an  image  is  a  process  that  shrinks  image  A 
by  the  structuring  element  B.  The  process  consists  of  finding  the  set  of  all  points  x 
where  B  shifted  by  x  is  contained  in  set  A.  This  process  is  defined  by 

AeB  =  {x\{B)^CA}  (2.22) 

[8].  A  binary  erosion  can  be  seen  in  Figure  2.9. 

2-4-4  Opening-  Opening  is  a  combination  of  an  erosion  followed  by  a 
dilation.  The  opening  process  generally  smoothes  the  contour  and  eliminates  thin 
lines.  A  opened  by  B  is  described  by 

AoB  =  {AeB)®B  (2.23) 

[8].  An  example  is  shown  in  Figure  2.9.  This  image  illustrates  how  thin  lines  in  the 
image  are  removed.  The  thin  lines  in  the  lower  right  corner  and  the  small  square  in 
the  upper  right  corner  of  the  original  image  have  been  removed. 

2-4-5  Closing-  Closing  is  also  a  combination  process.  It  consists  of  a 
dilation  followed  by  an  erosion.  The  closing  process  tends  to  smooth  the  contour, 
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fuse  narrow  breaks,  and  eliminate  small  holes  in  the  image.  A  closed  by  B  is  described 

by 

A»B  =  (AeB}eB  (2.24) 

[8].  An  example  is  shown  in  Figure  2.9.  This  shows  how  the  closing  process  can  be 
used  to  eliminate  small  holes  in  the  image  that  could  be  due  to  noise.  Notice  that  the 
small  white  square  that  was  near  the  center  of  the  original  image  has  been  removed, 
but  the  larger  square  holes  remained  intact.  This  process  is  useful  for  removing  small 
(smaller  or  equal  to  the  size  of  the  structuring  element)  bits  of  noise  from  images. 

S.4-6  Gray-Scale.  Until  now,  this  section  has  dealt  with  binary  images, 
but  there  are  also  equations  governing  the  morphological  processing  of  gray-scale 
images.  The  functions  are  similar  to  binary  morphological  functions,  only  another 
dimension  has  been  added.  Gray-scale  dilation  is  described  by 

{A®B){s,t)  =  ma.x{A{s-x,t-y)AB{x,y)\{s~x),{t~y)  G  A;  (x,y)  G  B}  (2.25) 
and  gray-scale  erosion  is  described  by 

{AQB){s,t)  =  mm{A{s-\-x,t  +  y)- B{x,y)\{s  +  x),{tAy)  G  A;  {x,y)  G  B}  (2.26) 

[8]. 

Gray-scale  dilation  consists  of  shifting  the  structuring  element  to  all  positions 
in  the  image  where  at  least  one  pixel  overlaps,  as  shown  in  Figure  2.10.  For  each  shift, 
the  overlapping  regions  of  the  structuring  element  are  added  to  the  image  and  the 
maximum  pixel  value  in  that  new  region  becomes  the  pixel  value  for  the  new  dilated 
image.  Gray-scale  erosion  is  similar.  It  consists  of  shifting  the  structuring  element  to 
all  positions  in  the  image  where  the  structuring  element  is  entirely  contained  in  the 
image  as  shown  in  Figure  2.10.  For  each  shift,  the  structuring  element  is  subtracted 
from  the  image  and  the  minimum  pixel  value  becomes  the  new  eroded  pixel  value. 
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B 


A 


A 


At  each  shift,  the  two  overlapping  regions 
are  added  together  and  the  maximum 
value  is  put  at  point  P  in  a  new  image. 


erosion 

At  each  shift,  the  two  overlapping  regions 
are  subtracted  and  the  minimum  value  is 
put  at  point  P  in  a  new  image. 


Figure  2.10  Gray-scale  dilation  and  erosion. 


Just  as  in  binary  morphological  operations,  the  gray-scale  opening  of  an  image 
is  formed  by  performing  an  erosion  followed  by  a  dilation  of  the  eroded  image. 
Similarly,  the  gray-scale  closing  is  formed  by  performing  a  dilation  followed  by  an 
erosion.  These  basic  gray-scale  morphological  functions  ultimately  lead  to  the  CMO 
(close  minus  open)  algorithm.  This  CMO  algorithm  is  used  for  clutter  reduction  in 
IR  images  by  removing  large  area  background  regions  and  converting  all  objects  in 
the  scene  to  hot  [1].  An  example  showing  and  describing  each  stage  in  the  gray-scale 
CMO  process  can  be  seen  in  Chapter  III. 


2.5  Conclusion 

This  chapter  described  the  basic  theory  behind  the  methods  used  in  this  re¬ 
search  to  process  images  and  measure  the  classification  errors  incurred.  The  next 
chapter  describes  how  this  theory  is  implemented  using  both  Khoros  and  Matlab 
software. 
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III.  Methodology 


Some  of  the  image  processing  was  done  in  Khoros,  so  a  quick  tutorial  on 
Khoros  is  included  in  Appendix  A.  The  rest  of  the  code  used  in  this  research  is 
included  in  Appendices  B,  C,  and  D.  This  chapter  will  describe  the  methods  used  to 
retrieve  the  images  from  the  8mm  tapes,  prepare  them  for  processing,  extract  spatial 
features,  perform  the  morphological  transform,  and  find  the  errors  incurred  through 
classification. 

3.1  Preparing  the  Images 

The  images  were  received  on  8mm  tape  in  ARF  format  from  Wright  Labora¬ 
tories  Automatic  Target  Recognition  Branch.  Along  with  the  images  came  the  code 
needed  to  transform  the  data  from  ARF  to  the  VIFF  format  used  in  Khoros.  This 
transformation  is  achieved  by  creating  an  arf2viff  glyph  in  the  Cantata  workspace. 
This  is  done  by  selecting  from  the  Cantata  menu  workspace/file  utilities/UIS 
filename/arf2vifF.pane.  Now  the  images  can  be  viewed,  processed  in  Khoros,  or 
saved  into  matrix  files  to  be  imported  into  Matlab. 

3.S  Khoros  Spatial  Bands 

The  Khoros  image  processing  environment  has  a  built-in  function  to  extract 
spatial  features  from  images.  These  features  are  the  mean,  variance,  contrast,  an¬ 
gular  2"“^  moment,  entropy,  and  dispersion.  The  following  equations  that  describe 
how  each  spatial  feature  is  calculated  were  obtained  from  the  Khoros  manual  by 
Donohoe  [3].  The  algorithm  assumes  there  are  K  gray  levels  in  the  image.  In  this 
case,  K  =  255  and  p{k)  =  the  probability  of  occurrence  of  each  gray  level  k,  where 
k  =  0,...,255. 

K 

mean,  g  =  E[k]  =  ^kp{k)  (3.1) 

k=0 
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K 


K 


variance  =  E[k'^\  —  E[kY  =  ^  k^  p{k)  — 

k=0 

K 

contrast  =  '£Pp{k) 

fcrrO 

K 

angular  2“^  moment  =  p^jk) 

k=0 

K 

entropy  =  -  P(^) 


J2kp{k) 

Lfc=o 


A;=0 


K 


dispersion  =  J2\k-p\  p{k) 
k=0 


(3.2) 

(3.3) 

(3.4) 

(3.5) 

(3.6) 


For  each  feature  band,  the  window  is  shifted  to  be  centered  on  every  point  in 
the  original  image  where  the  window  is  fully  contained  in  the  image,  as  shown  in 
Figure  3.1.  The  window  size  used  in  the  calculations  for  this  thesis  is  3x3.  Each  of 
the  six  spatial  feature  extraction  techniques  are  performed  on  the  pixels  within  the 
windowed  region,  with  the  results  becoming  the  new  pixel  values  in  the  six  extracted 
image  bands.  Due  to  the  way  the  window  is  placed,  there  will  be  a  one  pixel  border 
of  zeros  in  the  new  bands.  For  more  details  on  the  spatial  feature  band  extraction, 
see  Appendix  A.  An  actual  example  showing  an  original  image  along  with  the  six 
spatial  bands  extracted  from  can  be  seen  in  Figures  3.2  and  3.3.  (These  bands  have 
been  scaled  so  the  maximum  gray  level  is  255.) 


3.3  Morphology 

The  close  minus  open  morphological  transform  that  was  described  in  Chapter 
II  is  performed  on  the  image  using  the  following  Matlab  script  file  and  functions: 
cmo.m,  closing. m,  opening. m,  erosion. m,  and  dilation. m.  These  are  included  in 
Appendix  B. 

The  script  file  cmo.m  loads  the  image  data  file  and  allows  the  user  to  define  the 
morphological  kernel,  B.  These  two  data  matrices  are  then  passed  to  the  function 
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Image 


For  each  shift,  the  spatial  algorithms  are  implemented  and 
the  results  are  put  in  the  center  point  (p)  of  the  window  in 
the  new  image  bands.  (The  3x3  window  produces  a  one 
pixel  border  of  zeros  in  the  new  image  bands.) 

Figure  3.1  How  the  Khoros  spatial  bands  are  created. 


Figure  3.2  The  original  image. 
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Band  1  (mean)  Band  2  (variance) 


Band  3  (contrast)  Band  4  (angular  2”^^  momen 


Band  5  (entropy)  Band  6  (dispersion) 


Figure  3.3  Khoros  spatial  band  images. 
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closing. m  which  performs  a  dilation  of  the  original  image  and  then  an  erosion  of 
the  dilated  image.  Then  the  original  two  matrices  are  passed  to  opening. m  which 
performs  an  erosion  followed  by  a  dilation.  Finally,  the  opened  image  is  subtracted 
from  the  closed  image  to  complete  the  close  minus  open  transform,  and  the  resulting 
image  is  ready  to  be  put  into  the  classifying  code.  A  step  by  step  example  using  real 
images  and  showing  each  of  these  stages  is  shown  in  Figures  3.4  and  3.5. 

Image  A  Kernel  B 


Erosion 


Opening 


Figure  3.4  A  step  by  step  example  of  gray-scale  morphology. 


Closing 


Dilation 
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Close  Minus  Open 


Figure  3.5  The  output  for  the  gray-scale  close  minus  open  function. 

In  Figure  3.4  the  kernel  is  chosen  to  match  the  size  of  the  target  object  (the 
scud  in  this  case).  The  dilation  expands  the  image  and  creates  large  bright  areas 
where  there  were  bright  (hot)  areas  in  the  original  image.  In  this  case,  there  is  a 
large  bright  area  corresponding  to  the  scud  location.  In  just  the  opposite  way,  the 
erosion  shrinks  an  image  and  creates  large  dark  areas  where  there  were  dark  (cold) 
spots  in  the  original  image. 

Ideally,  the  result  of  a  closing  should  be  bright  areas  where  there  are  bright 
targets  in  the  image,  while  the  result  of  the  opening  should  be  dark  areas  where  there 
are  dark  targets  in  the  image.  In  both  cases,  the  backgrounds  should  be  similar,  so 
when  the  CMO  is  completed,  the  background  will  subtract  away,  leaving  both  hot 
and  cold  targets  brighter  than  the  new  background  level.  Figure  3.5  shows  how  the 
scud  location  is  brighter  than  the  background.  The  locations  of  the  small  targets 
in  the  foreground  are  only  slightly  brighter  than  the  areas  around  them.  The  CMO 
transform  did  not  perform  as  well  on  the  small  targets  because  the  kernel  size  was 
chosen  to  match  the  scud. 

3.4  Classifying  Code 

The  code  used  to  find  the  error  incurred  through  classification  was  obtained 
from  Curtis  Martin  and  modified  to  give  the  output  information  of  interest  to  this 
thesis  work  [11]. 
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Matlab  script  files  are  used  to  load  the  output  of  the  various  image  processing 
techniques  and  put  the  data  into  a  form  suitable  for  use  in  the  classifying  code. 
These  files  pass  a  target  matrix,  a  background  matrix,  the  Parzen  window  size,  and 
the  number  of  nearest  neighbors  to  be  used  in  the  classifying  code.  Both  the  target 
and  background  matrices  must  be  created  so  that  the  columns  represent  separate 
data  points  (pixels  in  the  case  of  the  IR  images)  and  the  rows  contain  the  features 
associated  with  those  points.  The  two  matrices  do  not  have  to  be  the  same  size,  but 
they  must  contain  the  same  number  of  rows  (features).  Martin’s  original  classifying 
code  is  then  used  to  determine  the  best  values  to  input  for  the  window  size  and  the 
number  of  nearest  neighbors. 


3.4.1  Bayes  Rule.  In  order  to  understand  the  calculation  of  the  error 
probabilities,  it  is  useful  to  note  the  decision  method.  The  Bayes  decision  rule 
used  in  this  code  is  a  manipulation  of  Equation  2.4.  It  is  called  the  log-likelihood 
discriminant  function  of  x  and  is  expressed  by 


/(a;)  =  —In 


P(^ik)  <' 

p(s2|a;)  S2 


<  0 


[11].  The  a  posteriori  probability  p(5i|x)  is 


p(sila;)  = 


Pi{x)P{si) 

p{x) 


(3.7) 


(3.8) 


where  P{si)  is  the  a  priori  probability  for  class  Si,  p{x)  is  the  pdf  of  x,  and  pi{x)  is  the 
class  conditional  pdf  of  x  for  class  Si.  Substituting  this  into  Equation  3.7,  assuming 
equal  a  priori  probabilities,  replacing  pi{x)  with  its  estimate  Pi{x),  and  introducing 
a  threshold  t  to  compensate  for  biases  in  the  estimation,  the  result  becomes 


[7,  11], 


l{x)  —  —In 


Pi{^)  <'  ^ 

P2{x)  S2 


(3.9) 
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3.4-2  Martin’s  Unmodified  Code.  The  original  unmodified  code  can  be 
seen  in  Appendix  C.  In  this  original  code,  the  hand-segmented  target  and  non-target 
background  sets,  and  a  range  of  values  for  h  and  k  are  passed  to  the  pknn.m  function. 
Then  the  following  procedure  is  performed  five  (the  number  of  tests  specified)  times 
and  the  results  are  averaged  together  to  obtain  the  final  Bayes  error  estimate  plots. 


•  For  each  set,  every  fifth  point  is  taken  as  a  test  sample  and  the  rest  of  the 
points  are  left  over  samples  used  to  create  the  covariance  matrices. 

•  The  sample  sets  and  inverse  covariance  matrices  are  passed  to  the  function 
compute-distances. m  which  computes  the  squared  Mahalanobis  distances  be¬ 
tween  all  the  points,  both  inter-class  and  intra-class.  The  distance  matrices 
formed  are  then  returned  to  pknn.m. 

•  Next,  for  each  value  of  k,  the  density  estimates  and  the  resubstitution  and 
leave-one-out  discriminant  values  are  calculated  for  the  k-NN  technique.  From 
Equation  2.11  the  k-NN  density  estimation  is 


)  ^k-NN 


)1' 


(3.10) 


where  is  the  rth  sample  of  class  Sm-  When  i  =  m,  the  resubstitution 
estimate  is  obtained  by  counting  as  its  own  nearest  neighbor,  while  the 
leave-one-out  estimate  does  not,  but  replaces  Ni  with  (W  — 1)  [11].  Substituting 
this  into  Equation  3.9,  the  discriminant  function  is 


=  In 


iVilEir/' 

iV2lE  211/2 


2 


(3.11) 


These  values  are  passed  to  classify.m  which  calls  min_error.m  to  find  the  re¬ 
substitution  minimum  error  and  threshold.  This  is  done  by  determining  how 
the  two  discriminant  value  sets  overlap,  varying  a  threshold  over  that  region, 
and  counting  the  errors  (misclassifications).  The  resubstitution  minimum  er- 
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ror  and  threshold  values  are  returned  to  classify.m  and  then  the  leave-one-out 
minimum  error  and  threshold  are  calculated  in  a  similar  fashion.  These  error 
values  are  then  returned  to  pknn.m. 

•  The  same  procedure  is  followed  for  the  Parzen  window  technique  for  each 
value  of  h,  but  using  different  equations  for  the  density  estimation  and  the 
discriminant  values.  The  resubstitution  density  estimate  is 


(3.12) 


while  the  leave-one-out  estimate  is 


1 

Ni-1 


Ni 


i=i 


kijO) 

/i" 


(3.13) 


Substituting  these  into  Equation  3.9,  using  a  Gaussian  window,  and  manipu¬ 
lating  the  equations,  the  resulting  likelihood  ratios  for  the  resubstitution  (R) 
method  and  leave-one-out  (L)  methods  for  class  si  and  S2»  respectively,  are 


,  ,uu  ,  Efi,exp(-<iS(4”>.d‘V2V) 

’  JV.IEjr  exp{-4{xf\xf)/2h^) 

,  ,  (2h  ^  |„  IVilEir^^  _  .  ESi  ev(-‘lf(xf\xj‘')/2h^) 

lL[Xj  )  ^  /  Ar  _  1  M  Jl/2  ^  12  r  (2)  J2)\  inin^ 


(3.14) 

(3.15) 

(fV2  -  1)1  Yljli  exp{-dl{xf\xf^)l2hP‘) —  I  ^  ^ 

For  a  complete  description  of  the  steps  involved,  see  the  work  by  Martin  [11]. 

•  All  of  the  error  values  are  then  passed  back  to  the  driver  script  file  where  plots 
can  be  created. 


3.4.3  Modified  Code.  The  modified  code  used  in  creating  the  ROC  curves 
can  be  seen  in  Appendix  D.  The  procedure  used  in  the  modified  code  is  very  similar 
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to  that  in  the  original  code.  The  values  of  h  and  k  that  gave  the  best  results  for  the 
leave-one  out  method  in  the  original  code  are  the  ones  used  in  the  modified  code. 
These  values,  along  with  the  hand-segmented  target  and  background  sets,  are  passed 
to  the  pknn2.m  function.  The  procedure  is  the  same  as  before  except  that  instead 
of  finding  the  minimum  error  and  threshold,  the  min_error2.m  function  determines 
where  the  two  discriminant  sets  overlap  and  varies  a  threshold  over  the  overlapping 
region.  For  each  threshold  value,  the  error  probabilities  are  calculated  using  the 
following  rule:  the  number  of  set  1  discriminant  values  below  the  threshold  divided 
by  the  total  number  of  set  1  discriminant  values  is  the  probability  of  a  hit,  and  the 
number  of  set  2  discriminant  values  below  the  threshold  divided  by  the  total  number 
of  set  2  discriminant  values  is  the  probability  of  false  alarm.  These  probabilities  are 
then  returned  to  the  driver  script  where  they  can  be  plotted  to  obtain  the  final  ROC 
curves  desired. 

Because  the  data  is  finite,  the  probabilities  are  discrete  and  the  ROC  curves 
created  from  them  are  not  smooth.  Each  of  the  ten  tests  creates  ROC  curves  of 
the  form  in  Figure  3.6.  Therefore,  there  needs  to  be  a  way  to  average  these  curves 
together  to  obtain  an  average  ROC  curve  for  each  image  processing  technique.  The 
function  that  does  this  is  interpolate.m,  which  can  also  be  seen  in  Appendix  D.  It 
makes  the  curves  monotonic  increasing  and  then  uses  the  Matlab  function  interpl 
to  interpolate  between  the  points,  as  shown  in  Figure  3.6.  Then,  once  all  ten  curves 
have  a  hit  probability  value  at  each  probability  of  false  alarm,  they  can  be  averaged 
together  to  create  the  final  ROC  curve. 

Due  to  the  averaging  involved,  a  confidence  interval  is  also  needed  for  the  ROC 
curve.  In  this  case,  the  standard  deviation  a  is  unknown,  but  an  estimate  s  can  be 
calculated  from  the  data.  The  distribution  that  can  be  used  to  compute  limits  on  the 
average  value  is  the  Student’s  t  distribution  published  by  W.S.  Gossett  and  perfected 
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ROC 


Interpolated  ROC 


Pfa  J’ta 

Figure  3.6  Example  showing  ROC  curve  and  the  interpolated  ROC  curve  used  in 
averaging. 

by  R.A.  Fisher  [2,  9,  17].  The  confidence  interval  is  given  by 

( P  —  i. 05— T=  iP 1. 05— 

where  p  is  the  average  hit  probability,  s  is  the  standard  deviation  of  the  n  samples 
used,  and  t.os  =  2.262  for  a  95%  confidence  interval  with  (n  —  1)  =  9  degrees  of 
freedom  [17]. 

3.5  Summary 

This  chapter  described  the  methods  used  to  process  the  images  and  create 
ROC  curves.  The  next  chapter  displays  and  describes  the  results  obtained  from 
these  methods. 


3-11 


IV.  Results 


4.1  Preliminary  Results 

In  order  to  create  the  ROC  curves,  values  need  to  be  chosen  for  both  h  and 
k.  This  is  done  using  the  original  unmodified  code  obtained  from  Martin  [11].  The 
output  plots  of  this  code  show  the  total  average  error  for  resubstitution  and  the 
two  leave-one-out  methods  for  varying  values  of  h  and  k,  as  shown  in  the  example 
output  in  Figure  4.1.  Here,  the  average  error  represents  the  percentage  of  total 
misclassifications,  both  false  positives  (false  alarms)  and  false  negatives  (misses). 
The  two  leave-one-out  methods,  described  in  detail  in  Martin’s  thesis,  are  produced 
by  two  different  threshold  selection  techniques  [11].  Since  the  best  values  of  h  and  k 
are  found  by  trial  and  error,  the  h  and  k  values  that  gave  the  smallest  leave-one-out 
(option  2)  error  in  the  output  of  the  original  code  are  used  as  inputs  to  the  modified 
code.  The  h  and  k  values  with  the  next  smallest  leave-one-out  errors  are  used  as 
needed.  In  this  work,  the  leave-one-out  option  2  is  used  because  it  is  less  biased 
than  option  1.  A  compilation  of  the  values  of  h  and  k  that  are  actually  used  in  the 
construction  of  the  ROC  curves  can  be  seen  in  the  tables  in  each  section. 

4-2  Test  Problems 

The  test  problems  consist  of  finding  ROC  curves  for  the  test  data  sets,  whose 
actual  distributions  are  shown  in  Figures  4.2  and  4.4. 

For  Test  1,  both  data  sets  are  Gaussian  distributions  of  points  along  the  x 
axis.  In  the  figure,  they  are  offset  in  the  y  direction  for  illustration  purposes  only. 
The  target  set  has  zero  mean  and  unit  variance,  and  the  non-target  set  has  mean 
fi  =  2  and  unit  variance.  Due  to  the  known  properties  of  this  data,  points  on  the 
empirical  ROC  curve  can  be  found  by  varying  the  threshold  along  the  x  axis.  The 
ROC  curves  for  Test  1  can  be  seen  in  Figure  4.3.  The  values  of  h  and  k  used  to 
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Original  Image  (k-NN)  Original  Image  (Parzen) 


(  -  Resubstitution,  --  Leave-one-out  1,  ..  Leave-one-out  2) 

Figure  4.1  The  output  for  the  original  image  using  Martin’s  unmodified  code. 

create  the  ROC  curves  are  shown  in  Table  4.1,  and  the  areas  under  the  leave-one-out 
curves  are  shown  in  Table  4.2. 


Input  Data 

h 

k 

Test  1 

3 

15 

Test  2 

2 

35 

Table  4.1  Values  chosen  for  h  and  k  for  the  test  sets. 


For  Test  2,  the  target  set  has  zero  mean  and  unit  variance,  and  the  background 
set  has  a  mean  along  the  x  axis  of  ^  =  3  and  unit  variance.  This  is  shown  in 
Figure  4.4.  The  data  is  arranged  so  the  empirical  ROC  can  be  found  by  varying  a 
threshold  along  the  x  axis.  The  ROC  curves  for  Test  2  can  be  seen  in  Figure  4.5.  The 
h  and  k  values  used  are  shown  in  Table  4.1,  and  the  areas  under  the  leave-one-out 
curves  are  shown  in  Table  4.2. 
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Probability  of  Hit 


From  the  figures  for  both  tests,  it  can  be  seen  that  the  resubstitution  and 
leave-one-out  estimates  are  good  approximations  to  the  empirical  ROC  curves  for 
the  test  data. 


Test  2  Data 


x-targel  o-baokground 

Figure  4.4  Data  sets  for  test  2. 


Average  ROC  Curve  (or  Test  2  (k-NN)  Average  ROC  Curve  for  Test  2  (Parzen) 


Figure  4.5  ROC  curves  for  test  2. 
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Table  4.2  Areas  under  the  ROC  for  the  test  data  sets. 


4-5 


4-3  Military  Target  Image  (Targets  v.  Background) 

The  original  IR  image  used  in  this  section  can  be  seen  back  in  Figure  1.2.  The 
actual  values  used  for  h  and  k  are  listed  in  Table  4.3. 


Input  Image 

h 

k 

Original  Image 

3 

20 

CMO  Morphological  Filtered  Image 

0.1 

22 

Band  1  (mean) 

2 

32 

Band  2  (variance) 

9.7 

m 

Band  3  (contrast) 

6.5 

m 

Band  4  (angular  2”^^  moment) 

0 

35 

Band  5  (entropy) 

B 

35 

Band  6  (dispersion) 

0.2 

9 

All  6  bands 

1031 

Bands  1,  2,  3,  &:  6 

1 

20 

Table  4.3  Values  chosen  for  h  and  k  for  the  target  image  of  Figure  1.2. 


4-S.l  Original  Image.  The  ROC  curves  created  using  target  and  back¬ 
ground  matrices  obtained  from  the  original  image  are  shown  in  Figure  4.6.  These 
curves  show  the  probabilities  of  hit  pixels  versus  false  alarm  pixels. 

4.3.2  Spatial  Bands.  The  ROC  curves  obtained  for  the  six  separate  spatial 
feature  bands  can  be  seen  in  Figures  4.7  through  4.12.  The  ROC  curves  for  the 
combinations  of  the  spatial  feature  bands  can  be  seen  in  Figures  4.13  and  4.14.  As 
can  be  seen  in  the  figures  and  table,  each  spatial  band  on  its  own  does  not  outperform 
the  original  unprocessed  image,  the  exception  being  band  1.  But  when  the  bands  are 
combined,  the  resulting  ROC  curves  are  better  then  the  original  ROC.  Therefore, 
combining  the  spatial  bands  will  produce  a  better  segmentation  technique. 


4.3.3  Morphological  Filter.  The  CMO  morphological  filter  in  this  section 
uses  a  10x50  pixel  size  kernel.  The  ROC  curves  in  Figure  4.15  show  that  this  seg- 


Average  ROC  Curve  for  the  Original  Image  (k-NN) 


1 


Average  ROC  Curve  for  the  Original  Image  (Parzen) 


Figure  4.6  ROC  curves  for  the  original  image. 


Average  ROC  Curve  for  Band  1  (k-NN) 


0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1 

Probability  of  False  Alarm 


Average  ROC  Curve  for  Band  1  (Parzen) 


Figure  4.7  ROC  curves  for  band  1  (mean). 
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Probability  of  Hit  _  _  Probability  of  Hit 


Average  ROC  Curve  for  Band  2  (k-NN)  Average  ROC  Curve  tor  Band  2  (Parzen) 


Figure  4.8  ROC  curves  for  band  2  (variance). 


Average  ROC  Curve  tor  Band  3  (K-NN)  Average  ROC  Curve  for  Band  3  (Parzen) 


Figure  4.9 


ROC  curves  for  band  3  (contrast). 
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Probability  of  HI!  Probability  of  Hit 


Average  ROC  Curve  tor  Band  4  (K-NN) 


Average  ROC  Curve  for  Band  4  (Parzen) 


Average  ROC  Curve  for  Bands  1 ,2,3,  &  6  (Parzen) 
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Probability  of  False  Alarm 

Figure  4.14  ROC  curves  for  bands  1,  2,  3,  and  6. 

mentation  technique  performs  better  than  the  original  unprocessed  image  as  well 
as  the  spatial  bands  and  the  combined  spatial  bands.  This  is  the  expected  result, 
considering  the  output  processed  images  are  of  the  types  in  Figures  3. 3-3. 5. 

4. 3-4  Section  Summary.  The  ROC  curves  produced  in  this  section  show 
the  performance  of  the  image  processing  techniques.  Of  the  techniques  used  here, 
the  morphological  CMO  algorithm  provided  the  best  results,  followed  by  the  com¬ 
binations  of  spatial  feature  bands.  These  are  qualified  as  being  better  because  the 
ROC  curve  is  higher  than  those  of  the  unprocessed  image  and  the  single  spatial 
band  features.  Another  way  to  measure  this  is  by  calculating  the  area  under  the 
ROC  curve.  These  areas  give  numbers  that  can  be  compared.  In  many  cases  curves 
with  larger  areas  are  better,  but  that  is  not  always  the  case.  Two  curves  with  equal 
areas  may  look  entirely  different,  so  which  one  is  better  depends  upon  the  specific 
application.  The  areas  under  the  average  leave-one-out  curves  have  been  calculated 
and  tabulated  in  Table  4.4. 


Average  ROC  Curve  for  Bands  1 ,2,3,&  6  (k-NN) 


Probability  of  False  Alarm 
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Probability  of  Hit 


Average  ROC  Curve  for  the  Morphed  Image  (k-NN)  Average  ROC  Curve  to  the  Morphed  Image  (Parzen) 
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0.2 1 


--  Resubstitution 
.  R  bounds 

—  Leave-one-out 

—  Lbounds 


0.1 


0.1 


0.1 


■  _ I  I  1 _ I _ I 

0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1 

Probability  of  False  Alarm 


0.1  0.2 
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Figure  4.15  ROC  curves  for  the  10x50  morphological  CMO  image. 


Input  Image 

ROC  Area 
(k-NN) 

ROC  Area 
(Parzen) 

Original  Image 

.7770 

.7546 

CMO  Morphological  Filtered  Image 

.9744 

.9716 

Band  1  (mean) 

.8006 

.7645 

Band  2  (variance) 

.7050 

.7239 

Band  3  (contrast) 

.7931 

.7480 

Band  4  (angular  2”‘^  moment) 

.7001 

.5152 

Band  5  (entropy) 

.6920 

.5223 

Band  6  (dispersion) 

.6822 

.6920 

All  6  bands 

.8881 

.8886 

Bands  1,  2,  3,  &:  6 

.8600 

.8798 

Table  4.4  Areas  under  the  leave-one-out  ROC  curves. 
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4-4  Scud  Image  (Scud  v.  Background  and  Other  Targets) 


The  image  from  Figure  1.2  is  also  used  in  this  section,  but  instead  of  identifying 
target  versus  non-target  pixels,  scud  pixels  are  identified  versus  non-scud  pixels.  In 
this  case,  the  other  targets  (tanks,  trucks,  etc.)  are  considered  to  be  in  the  non-scud 
pixel  set.  The  h  and  k  values  used  can  be  seen  in  Table  4.5. 


Input  Image 

1 

Original  Image 

s 

m 

CMO  Morphological  Filtered  Image 

n 

Bll 

Band  1  (mean) 

D 

B 

Band  2  (variance) 

D 

Band  3  (contrast) 

D 

m 

Band  4  (angular  2"^*  moment) 

H 

15 

Band  5  (entropy) 

3 

15 

Band  6  (dispersion) 

Q 

1^ 

All  6  bands 

1 

5 

Bands  1,  2,  3,  &  6 

1 

5 

Table  4.5  Values  chosen  for  h  and  k  for  the  scud  image  of  Figure  2. 


4-4-^  Original  Image.  The  ROC  curves  created  using  scud  and  non-scud 
matrices  obtained  from  the  original  image  are  shown  in  Figure  4.16.  Comparing 
Figure  4.16  with  Figure  4.6,  the  scud  versus  non-scud  pixel  classification  did  not 
perform  as  well  as  the  target  versus  non-target.  This  happens  because  the  other 
target  pixels  in  the  image  are  similar  to  scud  pixels,  or  at  least  more  similar  to  the 
scud  pixels  than  the  background  pixels. 

4.4-^  Spatial  Bands.  The  ROC  curves  obtained  for  the  six  separate  spatial 
feature  bands  can  be  seen  in  Figures  4.17  through  4.22.  The  ROC  curves  for 
the  combinations  of  the  spatial  feature  bands  can  be  seen  in  Figures  4.23  and  4.24. 


Average  ROC  Curve  tor  Scud  Original  Image  (k-NN)  Average  ROC  Curve  tor  Scud  Original  Image  (Parzen) 


Figure  4.16  ROC  curves  for  the  scud  original  image. 


Average  ROC  Curve  for  Scud  Band  1  ^-NN)  Average  ROC  Curve  for  Scud  Band  1  (Pareen) 


Figure  4.17  ROC  curves  for  the  scud  band  1  (mean). 
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Figure  4.19  ROC  curves  for  the  scud  band  3  (contrast). 
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Average  ROC  Curve  for  Scud  Band  6  (k-NN)  Average  ROC  Curve  for  Scud  Band  6  (Parzen) 


Probability  of  False  Alarm  Probability  of  False  Alarm 


Figure  4.22  ROC  curves  for  the  scud  band  6  (dispersion). 


Average  ROC  Curve  for  Scud  All  Six  Bands  0^-NN) 


Average  ROC  Curve  for  Scud  All  Six  Bands  (Parzen) 


Probability  of  False  Alarm 


Probability  of  False  Alarm 


Figure  4.23  ROC  curves  for  all  six  bands  for  the  scud. 
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Average  ROC  Curve  for  Scud  Bands  1 ,2,3  &  6  (k-NN)  Average  ROC  Curve  for  Scud  Bands  1 ,2,3  &  6  (Parzen) 


Probability  of  False  Alarm  Probability  of  False  Alarm 

Figure  4.24  ROC  curves  for  bands  1,  2,  3,  and  6  for  the  scud. 

These  again  show  that  combining  the  bands  seems  to  give  better  results  than  using 
each  band  alone. 

4.4-3  Morphological  Filter.  The  ROC  curves  for  the  CMO  morphological 
filtered  scud  image  can  be  seen  in  Figure  4.25.  Just  as  for  the  target  versus  non¬ 
target  section,  the  ROCs  created  from  the  morphological  processed  image  are  better 
than  those  of  the  original  and  the  spatial  bands.  Even  so,  these  ROC  curves  are 
not  as  good  as  those  for  the  target  versus  background  for  the  same  reason  as  stated 
before. 

4-4-4  Section  Summary.  The  ROC  curves  produced  in  this  section  show 
the  performance  of  the  image  processing  techniques  when  used  to  find  only  the  scud. 
Of  the  techniques  used  here,  the  morphological  CMO  algorithm  provided  the  best 
results,  followed  by  the  combinations  of  spatial  feature  bands,  just  as  in  section  4.3 
for  the  target  versus  non-target  problem.  The  areas  under  the  average  leave-one-out 
curves  have  been  calculated  and  tabulated  in  Table  4.6. 


4-18 


Probability  of  Hit 
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Average  ROC  Curve  for  Scud  Morphed  Image  (k-NN) 


Average  ROC  Curve  for  Scud  Morphed  Image  (Parzen) 


Figure  4.25  ROC  curves  for  the  10x50  morphological  CMO  scud  image. 


Input  Image 

ROC  Area 
(k-NN) 

ROC  Area 
(Parzen) 

Original  Image 

.6015 

.5811 

CMO  Morphological  Filtered  Image 

.8528 

.8731 

Band  1  (mean) 

.6188 

.5814 

Band  2  (variance) 

.7337 

.6473 

Band  3  (contrast) 

.6021 

.5805 

Band  4  (angular  2"*^  moment) 

.5187 

.5215 

Band  5  (entropy) 

.5144 

.5289 

Band  6  (dispersion) 

.7393 

.6951 

All  6  bands 

.7756 

.7936 

Bands  1,  2,  3,  &  6 

.7561 

.7459 

Table  4.6  Areas  under  the  leave-one-out  ROC  curves. 
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4-5  Breast  Cancer  Images  (Cancerous  v.  Healthy) 

The  methods  developed  to  estimate  the  hit  and  false  alarm  probabilities  for 
the  military  target  problem  can  also  be  applied  to  other  areas,  such  as  the  breast 
cancer  problem.  The  breast  cancer  images  used  in  this  research  were  provided  by 
The  University  of  Cincinnati.  An  example  image  can  be  seen  in  Figure  4.26. 


Breast  Cancer  Image  with  Tumor 


Figure  4.26  A  breast  cancer  image  with  a  tumor. 

A  part  of  the  image  within  the  marked  area  is  used  as  the  target  set,  and  a  sec¬ 
tion  from  the  upper  right  corner  (outside  the  marked  area)  is  used  as  the  background 
set.  These  signify  tumor  and  healthy  tissue,  respectively.  The  resulting  ROC  curves 
for  /i  =  1  and  k  =  lb  are  displayed  in  Figure  4.27.  The  areas  under  the  average 
leave-one-out  ROC  curves  for  the  k-NN  and  Parzen  methods  are  .8531  and  .8084, 
respectively. 

4-6  Breast  Cancer  Images  (Malignant  v.  Benign) 

As  well  as  finding  tumors,  the  code  is  used  to  find  the  probabilities  associ¬ 
ated  with  finding  malignant  versus  benign  image  pixels  in  the  tumors.  Examples  of 
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1 


Average  ROC  Curve  for  The  Tumor  (k-NN) 


1 


Average  ROC  Curve  for  The  Tumor  (Parzen) 


truthed  pieces  of  the  images  showing  malignant  and  benign  tumors  can  be  seen  in 
Figure  4.28. 

Following  the  ordering  from  left  to  right  in  Figure  4.28,  each  pair  of  malignant 
and  benign  tumors  are  used  as  target  and  non-target  input  sets  to  the  ROC  code. 
The  leave-one-out  ROCs  for  each  trial  are  shown  in  Figure  4.29,  and  a  ROC  for  all 
six  pairs  combined  is  shown  in  Figure  4.30.  Table  4.7  shows  the  values  used  for  h 
and  k  for  each  of  the  six  pairs  of  images. 

The  ROC  curves  produced  in  this  section  show  the  ability  of  the  algorithm 
to  determine  the  probability  of  distinguishing  between  malignant  and  benign  tumor 
pixels.  As  in  previous  sections,  the  areas  under  the  ROC  curves  have  been  calculated 
and  are  listed  in  Table  4.8. 

Next,  the  same  six  pairs  of  malignant  and  benign  tumors  are  put  through  the 
angular  2"^^  moment  spatial  feature  extraction  in  Khoros  and  the  ROC  analysis  is 
performed.  The  h  and  k  values  used  to  create  the  ROC  curves  are  listed  in  Table  4.9. 
The  leave-one-out  ROC  curves  for  each  pair  are  shown  in  Figure  4.31  and  the  ROC 
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Malignant  Tumors 


Figure  4.28  Examples  of  malignant  and  benign  tumors. 


Input  Images 

h 

k 

First  pair 

2 

25 

Second  pair 

5 

20 

Third  pair 

1 

5 

Fourth  pair 

1 

10 

Fifth  pair 

3 

20 

Sixth  pair 

1 

20 

All  6  pairs 

5 

35 

Table  4.7  The  h  and  k  values  for  the  malignant  v.  benign  tumors. 
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Probability  of 


k-NN 


Pair1  Pairs  Pair3  Pair*  PairS  Pair6 


Parzen 


Figure  4.29  ROC  curves  for  malignant  versus  benign  tumors. 


Average  ROC  Curve  for  Malignant  v.  Benign  Tumws  (k-NN)  Average  ROC  Curve  tor  Malignant  v.  Benign  Tumors  (Parzen) 


Probabiiity  of  False  Alarm  Probability  of  False  Alarm 


Figure  4.30  ROC  curves  for  malignant  versus  benign  tumors. 
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Input  Images 

ROC  Area 
(k-NN) 

ROC  Area 
(Parzen) 

First  pair 

.9569 

.9629 

Second  pair 

Third  pair 

.9019 

.8842 

Fourth  pair 

1 

1 

Fifth  pair 

IBSHH 

All  6  pairs 

■bhi 

Table  4.8  Areas  under  the  ROC  for  the  malignant  v.  benign  tumors. 

curve  for  all  six  pairs  combined  is  shown  in  Figure  4.32.  The  areas  under  the  curves 
are  listed  in  Table  4.10. 


Input  Images 

h 

k 

First  pair 

1 

30 

Second  pair 

1 

30 

Third  pair 

3 

35 

Fourth  pair 

2 

40 

Fifth  pair 

2 

50 

Sixth  pair 

1 

30 

All  6  pairs 

4 

30 

Table  4.9  The  h  and  k  values  for  the  malignant  v.  benign  tumors  after  angular  2"^ 
moment  is  performed. 


When  comparisons  are  made  between  the  unprocessed  images’  ROC  curves  and 
areas  and  the  angular  2"^^  moment  processed  images’  ROC  curves  and  areas,  it  can  be 
seen  that  the  unprocessed  images  are  better  classified.  It  turns  out  that  processing 
the  images  through  the  Khoros  angular  2"*^  moment  spatial  feature  extraction  did 
not  improve  the  segmentation  ability  of  the  images.  In  fact,  it  made  it  worse. 
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Probability  of  Hit 


k-NN 


Parzen 

Figure  4.31  ROC  curves  for  malignant  versus  benign  tumors  after  angular  2"“^  mo¬ 
ment  is  performed. 


Average  ROC  Curve  (or  Malignant  v.  Benign  T umors  (k-NN)  Average  ROC  Curve  tor  Malignant  v.  Benign  Tumors  (Parzen) 


Probability  ot  False  Alarm  Probability  ot  False  Alarm 

Figure  4.32  ROC  curves  for  malignant  versus  benign  tumors  after  angular  2”^^  mo¬ 
ment  is  performed. 
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Input  Images 

ROC  Area 
(k-NN) 

ROC  Area 
(Parzen) 

First  pair 

.7051 

.6279 

Second  pair 

.6691 

.5071 

Third  pair 

.7470 

.6025 

Fourth  pair 

.7226 

.6701 

Fifth  pair 

.7306 

.4973 

Sixth  pair 

.7113 

.4987 

All  6  pairs 

.5000 

.4969 

Table  4.10  Areas  under  the  ROC  for  the  malignant  v.  benign  tumors  after  angular 
2nd  ];noment  is  performed. 


4.7  Segmented  Images 

In  this  section  each  pixel  in  the  image  is  classified  as  target  or  background, 
and  a  new,  segmented  image  is  created.  The  thresholds  used  to  make  the  decisions 
are  created  by  putting  hand- segmented  target  and  background  matrices  through  the 
ROC  code,  choosing  a  desired  hit  probability,  and  finding  the  discriminant  value 
that  corresponds  to  that  hit  probability.  After  the  threshold  is  chosen,  discriminant 
values  are  calculated  for  each  pixel.  Each  pixel  is  classified  by  determining  which 
side  of  the  discriminant  threshold  it  falls  on.  A  problem  encountered  was  that  the 
images  were  too  large  to  compute  all  the  new  pixel  values  at  once.  So,  each  line  in 
the  image  was  done  separately. 

The  original  image  is  segmented  using  a  hit  probability  of  80%.  The  morpho¬ 
logical  filtered  image  is  segmented  using  a  hit  probability  of  95%.  These  values  are 
chosen  from  the  ROC  curves  in  Figures  4.6  and  4.15.  The  new  segmented  images 
can  be  seen  in  Figures  4.33  -  4.35.  In  the  segmented  images,  the  background  pixels 
are  black  and  the  target  pixels  are  white. 

From  the  images,  it  can  be  seen  that  the  morphological  filtered  image  out¬ 
performed  the  original  unprocessed  image  in  segmentation.  It  is  easy  to  see  that 
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Figure  4.35  The  segmented  morphological  filtered  image. 
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more  of  the  background  is  correctly  identified  as  background.  But  even  though  this 
morphological  filter  helped  in  the  segmentation,  it  still  did  not  perform  as  well  as 
had  been  hoped. 

To  take  the  segmentation  process  further,  the  segmented  morphological  filtered 
image  is  correlated  with  a  10x50  kernel  and  thresholded.  These  new  segmented 
images,  with  two  different  threshold  values  (250  and  325),  are  shown  in  Figure  4.36. 
This  extra  processing  has  eliminated  much  of  the  background,  but  it  still  shows  false 
alarms  and  misses. 


4-28 


4-8  Summary 

The  tests  performed  using  the  ROC  code  showed  that  the  estimates  give  a 
good  approximation  of  the  empirical  ROC  curves  for  the  data.  On  that  assumption, 
the  algorithms  were  performed  on  real  infrared  images.  These  output  ROC  curves 
and  the  areas  under  them  can  be  used  as  a  measurement  system  to  determine  the 
best  image  processing  algorithms  for  different  images. 
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V.  Conclusions 


5. 1  Introduction 

The  main  objective  of  this  research  is  to  develop  ROC  curves  to  compare  IR 
image  processing  techniques.  The  non-parametric  techniques  used  to  approximate 
the  density  functions  and  create  the  ROC  curves  were  tested  on  problems  with 
known  distributions,  then  applied  to  the  IR  image  problems.  Chapter  II  provided 
the  theory  behind  the  methodology  in  Chapter  III,  and  Chapter  IV  presented  the 
resulting  ROC  curves.  This  chapter  summarizes  the  results  of  this  research. 

5.2  Summary  of  Significant  Results 

5.2.1  Test  Results.  The  test  problems  show  that  the  ROC  curves  created 
from  the  density  estimation  process  are  good  estimates  of  the  empirical  ROC  curves 
of  the  given  data  sets. 

5.2.2  IR  Results.  After  showing  that  the  estimated  ROC  curves  approx¬ 
imate  the  true  curves  for  the  test  data,  the  techniques  are  applied  to  actual  IR 
imagery.  In  trying  to  identify  target  versus  background  pixels,  the  ROC  curves  in¬ 
dicate  that  combining  the  spatial  features  bands  gives  better  results  than  using  no 
processing  on  the  image.  Furthermore,  the  gray-scale  CMO  morphological  opera¬ 
tor  provides  better  results  than  both  the  unprocessed  image  and  the  spatial  feature 
bands.  So  in  this  case,  the  CMO  is  the  better  image  processing  technique. 

Since  the  USAF  is  interested  in  finding  scud  missile  launch  vehicles,  the  ROC 
curve  techniques  are  used  to  identify  scud  versus  non-scud  pixels  in  the  same  images. 
The  results  obtained  were  similar  to  those  of  the  target  versus  background  problem. 
The  CMO  algorithm  is  the  better  processing  technique.  One  result  to  note  is  that 
the  ROCs  created  for  the  target  case  are  better  than  those  for  the  scud  case.  This 
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is  due  to  the  fact  that  the  other  target  pixels  (tanks,  trucks,  etc.)  are  similar  to  the 
scud  pixels,  or  at  least  more  similar  to  the  scud  pixels  than  the  background  pixels. 

5.2.3  Breast  Cancer.  The  ROC  estimation  techniques  are  applied  to  the 
breast  cancer  x-ray  images  in  order  to  show  that  it  is  possible  to  determine  the  hit 
and  false  alarm  probabilities  associated  with  finding  breast  cancer  tumors. 

This  was  taken  one  step  further  to  the  problem  of  determining  the  probabilities 
of  malignant  and  benign  tumors.  The  angular  2”'^  moment  was  performed  on  the 
tumor  images,  and  the  ROCs  obtained  were  compared  with  the  ROCs  from  the 
unprocessed  images.  It  turns  out  that  calculating  the  angular  2"'^  moment  did  not 
improve  the  segmentation  ability  of  the  tumors. 

5.3  Conclusions 

The  results  of  this  thesis  lead  to  the  following  conclusions: 

•  The  test  data  sets  show  that  the  estimated  ROC  curves  developed  in  this 
research  are  a  good  approximation  to  the  actual  ROC  curves  from  the  images. 

•  The  ROC  curves  can  provide  an  effective  means  of  measuring  the  performance 
of  image  processing  and  segmentation  techniques. 

•  Another  means  of  measuring  the  performance  of  image  processing  techniques  is 
the  area  under  the  ROC  curves.  A  larger  area  (with  area=l  being  the  highest) 
generally  signifies  a  better  image  processing  or  segmentation  technique. 

•  The  ROC  analysis  used  here  can  be  used  on  a  variety  of  images  in  different 
applications,  not  just  in  the  military  target  scenario. 

•  Of  the  image  processing  techniques  used  in  this  thesis,  the  CMO  morphological 
filter  produced  the  best  results.  For  the  target  versus  background  problem,  the 
areas  under  both  the  k-NN  and  Parzen  estimated  ROC  curves  were  increased 
from  .7770  and  .7546  for  the  original  image  to  .9744  and  .9716  for  the  mor- 
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phological  processed  image.  For  the  scud  versus  non-scud  problem,  these  ROC 
areas  were  increased  from  .6015  and  .5811  to  .8528  and  .8731. 

5.4  Recommendations 

This  research  can  be  continued  in  the  following  areas: 

•  Instead  of  calculating  discriminant  values  to  create  a  one-dimension  decision 
problem,  leave  the  data  in  n-dimensional  space  and  use  hyper-planes  as  the 
decision  boundaries. 

•  Discover  a  way  to  find  actual  probabilities  of  hit  and  false  alarm  at  the  target 
level  rather  than  the  pixel  level. 
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Appendix  A.  KHOROS 


A.l  Cantata  Workspace 

The  Cantata  workspace  is  the  image  processing  environment  used  in  this  re¬ 
port.  To  get  this  workspace,  you  must  have  the  path  set  up.  This  should  be  done 
when  you  receive  your  account.  If  not,  ask  Dave  Doak  or  Dan  Zambon  to  help  you 
set  the  path.  Once  this  is  done,  just  go  into  a  command  tool  and  type  cantata  to 
get  to  the  cantata  workspace  shown  in  Figure  A.l. 

A. 2  Input  Images 

In  the  workspace,  there  is  a  box  called  input  sources.  Pull  down  the  menu 
from  that  box  and  you  can  either  input  an  image  file  or  create  a  file  in  Khoros.  To 
input  a  file,  choose  input  data  file.  When  the  input  file  window  appears,  you  can 
choose  any  of  the  standard  images,  or  you  can  use  your  own  by  clicking  the  user 
defined  box  and  typing  in  the  file  name  or  using  the  browser.  Use  the  browser  by 
clicking  on  user  specified  file.  This  will  take  you  to  a  file  manager  where  you 
can  choose  your  file.  You  can  move  through  your  directory  and  even  get  a  file  from 
a  classmate’s  directory.  Once  you  choose  your  file,  click  on  the  glyph  box.  This 
will  create  an  input  glyph  containing  your  input  image  that  you  can  position  on  the 
workspace  with  your  mouse. 

A. 3  Using  Glyphs 

Once  you  have  an  input  image,  you  can  begin  processing  it  by  selecting  other 
glyphs  and  connecting  them.  Connect  them  by  clicking  the  right  side  arrow  on  one 
glyph  to  the  left  side  arrow  of  the  next  glyph.  An  example  of  connected  glyphs  in 
a  workspace  is  shown  in  Figure  A. 2.  You  can  run  your  program  by  clicking  on  the 
run  box  or  by  turning  on  each  glyph  separately  using  its  on/off  switch. 
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Figure  A.l  The  Cantata  workspace 


Figure  A. 3  An  example  glyph. 


A-3 


An  example  glyph  can  be  seen  in  Figure  A. 3.  Use  the  arrows  to  connect  the 
glyph  to  other  glyphs  in  the  workspace.  All  of  the  dark  arrows  must  be  connected 
to  something  or  there  will  be  an  error,  but  the  light  arrows  are  only  used  in  some 
options  for  the  glyph.  To  delete  a  glyph,  click  on  the  bomb  symbol  in  the  upper 
left  corner.  To  turn  a  glyph  on,  click  on  the  switch  in  the  upper  right  corner.  To 
change  any  of  the  settings  on  the  glyph,  click  on  the  top  middle  box  to  get  the  menu 
window  for  that  glyph. 

A. 4  Commonly  Used  Functions 

This  section  contains  information  on  where  to  find  some  useful  commands,  as 
well  as  some  examples  and  a  few  things  you  should  know  about  them.  This  is  only 
a  partial  list,  but  it  contains  the  ones  I  found  most  useful. 

A. 4.1  Convolution  (2D).  You  can  find  convolution  under  image  pro¬ 
cessing,  spatial  filters.  You  just  need  to  input  an  image  and  a  kernel.  A  faster 
way  to  convolve  if  you  have  large  images  is  to  take  the  FFT,  multiply,  and  inverse 
FFT.  To  show  you  what  to  expect  as  output,  the  two  images  in  Figure  A. 4  have 
been  convolved  to  give  the  output  seen  in  Figure  A. 5.  The  dimensions  of  the  images 
shown  here  are  proportional  to  the  actual  dimensions  of  the  images  that  were  used. 

A. 4-2  Correlation  (2D).  There  is  not  a  built  in  correlation  function.  To 
perform  a  correlation,  flip  the  input  kernel  image  (both  ways  using  the  flip  option) 
and  perform  a  convolution.  If  you  have  complex  images,  you  must  conjugate  and  flip 
the  kernel.  An  example  of  the  correlation  of  the  two  images  in  Figure  A. 4  is  shown 
in  Figure  A. 6. 

A.4.3  Data  Conversion.  Sometimes,  you  need  to  convert  data  from  one 
type  to  another.  Usually,  Khoros  will  let  you  know  by  giving  you  an  error  and 
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Figure  A. 4  The  input  image  and  kernel. 


Figure  A. 5  The  resulting  convolution. 


Figure  A. 6  The  resulting  correlation. 
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telling  you  which  type  it  needs.  Then  just  go  to  conversions,  data  conversion 
and  choose  the  type  you  need. 


A. 4. 4  Expand.  To  enlarge  an  image,  click  on  image  processing,  geo¬ 
metric  manipulations,  expand  image.  Then  choose  a  scale  factor.  Be  aware 
that  resolution  does  not  increase.  Each  pixel  just  gets  bigger,  as  you  can  see  in 
Figure  A. 8. 


A. 4. 5  Extract  Part  of  an  Image.  Look  under  image  processing,  sub- 
region.  You  can  determine  the  coordinate  values  you  desire  by  running  the  mouse 
over  the  original  image.  An  example  of  a  segmented  image  is  shown  in  Figure  A. 8. 
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Figure  A. 8  An  extracted  image  and  corresponding  expanded  image. 

A. 4 -6  File  Formats.  If  you  want  to  input  or  export  files  from  between 
different  programs  such  as  Khoros  and  Matlab,  just  do  a  file  conversion  by  clicking 
on  conversions,  standard  file  format  and  then  choose  the  file  type  you  need. 

If  you  want  to  convert  between  raw  or  ascii  and  VIFF  files,  look  under  con¬ 
versions,  raw  file  format. 

A. 4-7  Filters  (Frequency).  Frequency  filters  are  found  under  image  pro¬ 
cessing,  frequency  filters.  These  assume  that  your  input  image  is  already  in  the 
frequency  domain.  An  example  output  of  a  high-pass  frequency  filter  is  shown  in 
Figure  A. 10. 

A. 4-8  Filters  (Spatial).  Spatial  filters  are  found  under  image  processing, 
spatial  filters  These  assume  that  the  input  image  is  in  the  space  domain.  The  edge 
extraction  filters  have  the  same  effect  as  converting  to  the  frequency  domain  and 
doing  high-pass  frequency  filtering.  An  example  of  this  is  shown  in  Figure  A.  10. 

A. 4-9  Flip.  Look  under  image  processing,  geometric  manipulations. 
Input  the  image  and  choose  which  way  you  want  to  flip  it.  For  a  correlation,  flip  the 
image  both  top-for-bottom  and  right-for-left. 

A. 4-10  Fourier  Transform  (FFT  &  IFFT).  The  2D  Fourier  transform  is 
located  under  image  processing,  transforms.  You  can  choose  either  forward  or 
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Figure  A. 9  Lenna. 


Figure  A.  10  Lenna  after  using  an  edge-extract  spatial  filter  and  high-pass  frequency 
filter. 
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inverse.  One  requirement  is  that  the  input  images  have  dimensions  that  are  a  power 
of  two,  so  you  may  need  to  pad  them. 

A. 4-11  Multiplication.  Look  under  arithmetic,  binary  arithmetic.  This 
performs  a  pixel  by  pixel  multiplication  of  the  images.  When  you  multiply  two 
images,  they  must  be  the  same  size. 

A. 4-12  Pad  an  Image.  Look  under  image  processing,  subregion.  Input 
the  new  row  size  and  column  size  of  the  padded  image. 


A.4.13  Spatial  Feature  Band  Extraction.  The  Khoros  image  processing 
environment  has  a  built-in  function  to  extract  spatial  features  from  images.  These 
features  are  the  mean,  variance,  contrast,  angular  2”^^  moment,  entropy,  and  disper¬ 
sion.  The  following  equations  describe  how  each  spatial  feature  is  calculated.  The 
algorithm  assumes  there  are  K  gray  levels  in  the  image  and  p{k)  =  the  probability 
of  occurrence  of  each  gray  level  k. 


variance  == 


K 


mean, =  E[k]  =  kp{k) 

(A.l) 

k=0 

K 

■  K 

2 

E[k^]-E[k]^  =  J2k'^p{k)~ 

Y,kp{k) 

(A.2) 

k=0 

.k=0 

K 

contrast  =  P  p{k) 

(A.3) 

k=0 

K 

angular  2"^*  moment  =  P^jk) 

(A.4) 

k=0 

K 

entropy  =  -  logz  p{k) 

(A.5) 

k=0 

K 

dispersion  =  'Y^\k  —  p\  p{k) 

k=0 

(A.6) 
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Figure  A.  11  shows  how  the  Khoros  glyphs  are  connected  in  order  to  extract  the 
spatial  feature  bands  from  the  image.  The  input  glyph  loads  the  image,  arf2vifF 
converts  the  input  data  to  the  VIFF  type  used  in  Khoros,  vextract  is  used  to  extract 
the  356x700  image,  and  vconvert  converts  the  data  type  to  byte  which  is  needed 
in  vspatial.  The  menu  in  the  vspatial  glyph  can  be  used  to  select  the  window  size 
and  the  spatial  feature  bands  to  be  extracted.  The  vbandsptS  glyphs  extract  each 
spatial  band  separately  so  it  can  be  stored  as  a  separate  file  through  the  vifF2mat 
glyph  operation.  After  removing  the  square  brackets  at  the  beginning  and  ending  of 
the  files,  they  can  be  loaded  into  Matlab.  These  new  image  band  files  are  used  in 
the  classifying  code. 


Figure  A.  11  A  Cantata  workspace  showing  the  Khoros  glyphs  used  to  extract  spa¬ 
tial  features. 
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Example 

image 


ou^ut  spatial  band  images 


0 

0 
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0 

0 

0 
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0 

0 
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Figure  A.  12  How  the  Khoros  spatial  bands  are  created. 


Using  the  example  in  Figure  A.  12  where  a  through  /  are  various  gray-scale 
values,  the  following  equations  show  the  calculations  for  the  first  shift  of  the  window 
(where  it  covers  the  nine  pixels  in  the  upper  left  corner.) 


mean:  xl  =  ^+6+c+e+rf+/+a+d+t^ 

variance*  x\  =  ^ g-b^+c-be+t^+Z+tx-f ^ ^ 

contrast:  xl  =  (|)  +  6^  (i)  +  c2  (i)  +  (f*  (|)  +  (l)  +  p  (l) 

ang  2^^^  moment:  xl  =  (§)'+  {if  +  (|)'  +  (§)'  +  (l)'  +  (l)' 
entropy:  xl  =  -  (|)  log2  (|)  -  (|)  log2  (|)  -  (|)  log2  (|)  -  (|)  log2  (|)  - 

(i)  log2  (I)  -  (I)  log2  (I) 

dispersion:  xl  =  |a  -  /i|  (|)+  \b  ~  i^\  (|)-f  \c-  p  (|)+  \d  ~  n\  (§)+  [e-  )u|  (|)  + 
1/  -  fi\  (l)  ,  where  fi  =  ^'’+£+.S+|L/+°+‘^+<^. 


A. 5  User  Defined  Functions 

In  the  image  processing  I  have  been  doing,  it  was  necessary  to  bring  in  a 
program  that  could  convert  ARF  image  files  to  VIFF  files.  The  program  was  obtained 
through  Wright  Lab  and  was  already  put  into  a  form  suitable  for  bringing  it  into 
Khoros.  I’m  not  sure  exactly  how  it  was  done.  All  I  had  to  do  was  to  click  on 
workspace  and  then  file  utilities  to  get  a  menu.  Then  I  clicked  on  UIS  file  and 
got  the  browser  where  I  chose  the  file  I  wanted  to  bring  into  Khoros.  Doing  this 
created  a  glyph  that  could  be  manipulated  like  any  other  glyph  in  Khoros. 
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A. 6  Printing  an  Image 

To  print  an  image  directly  from  Khoros,  you  can  use  the  menu  to  choose 
output  and  then  print  image.  Choose  the  format  you  need  and  then  in  the  send 
to  printer  box,  type  |  Ipr  -PsparcS  (or  any  printer  of  your  choosing.)  Then  run 
that  glyph. 

An  alternative  to  this  is  to  save  the  image  into  your  directory  as  a  .ps  file 
then  you  can  use  a  regular  print  tool  to  print  the  image.  To  do  this,  select  from 
the  Khoros  menu  conversions  then  standard  file  format.  When  you  get  the  new 
window,  select  VIFF  to  Raster,  the  Khoros  input  file,  and  the  name  for  the  output 
file.  This  will  create  a  raster  (.rs)  file  in  your  directory.  Then  just  convert  this  to  a 
.ps  file.  This  can  be  done  in  two  ways: 

1.  In  a  command  tool,  type  convert  oldfilename.rs  newfilename.ps.  The  new 
postscript  file  will  appear  in  your  file  manager. 

2.  In  a  command  tool,  type  xv  filename. rs.  The  image  will  appear.  To  get  a 
menu  of  options,  click  the  left  mouse  button  on  the  image.  When  saving  the 
image,  you  can  choose  the  output  file  type  postscript. 

A.l  Workspace  Utilities 

A. 7.1  Save  a  Workspace.  To  save  a  workspace,  pull  down  the  workspace 
menu  in  cantata.  Choose  file  utilities  and  enter  an  output  file  name  by  typing  it  in 
(or  using  the  browser  if  the  file  already  exists).  After  entering  the  file  name,  press 
return. 

A. 7. 2  Restore  a  Workspace.  To  restore  a  saved  workspace,  again  choose 
workspace  followed  by  file  utilities.  Then  enter  an  input  file  name  by  typing  it  in 
or  using  the  browser.  The  saved  workspace  will  be  restored  to  the  screen. 
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Appendix  B.  Morphology  Code 


B.l  driver  script  file 

%  This  script  file  performs  the  close  minus  open  (CHO)  algorithm  on  an  image, 
y,  (Uses  image  A  and  morphological  kernel  B  to  produce  the  CMO) 

load  origimage.dat; 

A=origimage ; 

B=ones(10,50) ; 

c=closing(A,B) ; 
o=opening(A,B)  ; 
imageout=c-o ; 
save  morphlOxSO  imageout 


B.2  closing. m 

*/,  CLOSING:  This  function  performs  the  gray-scale  closing  of  an  image  by  a 
*/,  structuring  element 


•/. 

'/,  Inputs :  A 

•/.  B 

•/. 

'/,  Output :  c 

•/. 


image 

structuring  element 
closed  image 


fuiiction[c]=closing(A,B) 
temp=dilation(A,B) ; 
c=erosion(temp,B) ; 


B.3  opening. m 

%  OPENING:  This  function  finds  the  opening  of  a  grayscale  image  A  by  the 
%  structuring  element  B. 

% 


%  Inputs: 

y. 

y. 


%  Output : 

t 


A  :  image 

B  :  structuring  element 
c  :  opened  image 


f unct ion [c] =opening( A , B) 
temp=erosion(A,B) ; 
c=dilation(temp ,B) ; 
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B.4  dilation. m 


'/.  DILATION :  This  function  finds  the  grayscale  dilation  of  an  image  by  a 
*/,  structuring  element . 

•/. 

'/.  [dilatedimage]  =  dilation(A,B) 

•/. 


'll  Input  s : 

•/. 

•I, 


'll  Output : 

'll 


A  ;  image 

B  :  structuring  element 
dil:  dilated  image 


f unct ion [dil] =dilat lon( A , B) 

[Arow,Acol]=size(A) ; 

[BroH,Bcol]=size(B) ; 

neHA=zeros(Aron+2*Brow-2, Acol+2*Bcol-2) ; 
neuA (Brow : Ar os+BroH-l , Bcol ; Acol+Bcol-1) =A ; 

for  s=Brow:AroH+2*BroH-2 

for  t=Bcol;Acol+2*Bcol-2 

dil(s-BroB+l  .t-Bcol+l)=maic(mar(neBA(s  :-l  :s-BroH+l  ,t  :-l  :t-Bcol+l)+B))  ; 

end 

end 


B.5  erosion. m 


■/.  EROSION: 

•/. 

t 

%  Inputs: 

% 

% 


y,  Output : 

y. 


This  function  finds  the  gray-scale  erosion  of  an  image  by  a 
structuring  element . 

A  :  image 

B  :  structuring  element 
ero :  eroded  image 


function[ero]=erosion(A,B) 

[AroH,Acol]=size(A) ; 

[Brow,Bcol]=size(B) ; 

for  s=l : Arow-BroH+1 

for  t==l :  Acol-Bcol+1 

ero(s,t)=min(min(A(s :s+Brow-l ,t : t+Bcol-I)-B)) ; 

end 

end 
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Appendix  C.  Martin’s  Code 


C.l  driver  script  file 

%  Driver  script  file  used  to  run  Martin’s  code  and  create  the  plots 

load  targetset; 
load  backgroundset ; 


k=2:50; 
h=.l: .1:10; 

opt=l ; 

[Rpl,  Lpl,  Rkl,  Lkl]  =  pknnCtargetset .backgroundset ,  h,  k,  opt); 
opt=2 ; 

[Rp2,  Lp2,  Rk2,  Lk2]  =  pknnCtargetset .backgroundset ,  h,  k,  opt); 

mRpl=mean(Rpl) ; 
iiiLpl=iiiean(Lpl) ; 
inRkl=mean(Rkl)  ; 
mLkl=mean(Lkl) ; 

[minpl . ipl]  =  min (mLpl ) ; 

[minkl.ikl]  =  rain(mLkl); 

iiiRp2=mean(Rp2)  ; 
iiiLp2=mean(Lp2) ; 
iiiRk2=mean(Rk2)  ; 
mLk2=mean(Lk2) ; 

[minp2,ip2]  =  min(mLp2) ; 

[mink2.ik2]  =  min(raLk2) ; 

figured) 
subplot (2,1,1) 

plotCk,  mLkl,  ’  — k,  mRkl ,  ’-’);  hold  on 
plot([k(ikl)  k(ikl)]  ,  [mLkldkl)  mRkl(ikl)],  ’u*’); 
plotCk,  mLk2,  ’ :  ’)  ; 

plot([k(ik2)  k(ik2)],  [mLk2(ik2)  mRk2(ik2)],  ’»♦’); 

legendC , ’Resubstitution’ , ’  —  ’ , ’Leave-one-out  (optl) ’Leave-one-out  (opt 2)  ’)  ; 
ylabeK ’Average  Error  ('/,)’);  rlabel(’k’); 
title( ’k-HH’) ;  hold  off 

subplot(2,l,2) 

plot(h,  mLpl,  ’ — ’,  h,  mRpl,  ’-’);  hold  on 
plot([h(ipl)  h(ipl)] ,  [mLpl (ipl)  mRpl (ipl)]  ,  ’»♦’); 
ylabeK ’Average  Error  ('/,)’);  rlabel(’h’); 
title( ’Parzen’ ) ; 

legend( ’- ’ , ’Resubstitution’ , ’  —  ’ , ’Leave -one -out  (optl) ’Leave-one-out  (opt 2)  ’ )  ; 
plot(h,  mLp2,  ’:’); 

plot([h(ip2)  h(ip2)] ,  [mLp2(ip2)  mRp2(ip2)] ,  ’»»’);  hold  off 
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C.2  pknn.m 

%  PKHB:  Run  Parzen  and  knn  procedure  for  XI,  X2  10  times. 

y. 

y.  [Rp,  Lp,  Rk,  Lk]  =  PKHHCXl,  X2,  h,  k,  opt) 

y. 

y.  Inputs:  XI,  X2;  data  sets  (n  x  III  and  n  x  12) 

y.  h,  k:  values  to  use  for  h  and  k 

y,  opt:  1  =  threshold  option  3 

y,  2  =  threshold  option  4 

y. 

y. 

y.  All  h’s  must  be  greater  than  zero,  and  k  must  be  betueen  2  and 
y.  min(Hl,  H2)-l 

y. 

y.  Outputs:  Rp,  Lp:  Parzen  R  and  L  errors  for  each  test  (one  test  per  row) 
y,  Rk,  Lk:  k-SH  R  and  L  errors  for  each  test 

function  [Rp,  Lp,  Rk,  Lk]  =  pknnCXl,  X2,  h,  k,  opt) 

[nl,  HI]  =  size(Xl); 

[n2,  H2]  =  size(X2); 
if  nl  ■■=  n2 

fprintf(2,  ’Data  sets  XI  and  X2  must  have  same  number  of  ross 
(f eatures)\n’) ; 

return; 
end  y.  if 

y.  Keep  this  value  on  hand 
dim  =  nl ; 
ntests  =  10; 

Rp  =  zerosCntests ,  size(h,2)); 

Lp  =  zerosCntests,  size(h,2)); 

Rk  =  zeros(ntests,  size(k,2)); 

Lk  =  zerosCntests,  size(k,2)); 

fprintfCl,  ’Threshold  Option  =  '/.d  \n’,  opt); 

fprintfCl,  ’Performing  '/.d  independent  tests:  \n’,  ntests); 

for  test  =  l:ntests, 

fprintfCl,’  Test  ty,d:\n’,  test); 

testsamps  =  test :ntests:Hl; 
tl  =  [1,  testsamps+1] ; 
t2  =  [testsamps-1 ,  HI]; 
others  =  []  ; 
for  i  =  1 :size(tl,2) , 

others  =  [others,  tlCi) :t2(i)] ; 
end  y,  for  i 

xl  =  XlC:,  testsamps); 

151  =  covCXlC :, others) ’) ; 

testsamps  =  test :ntests:H2; 
tl  =  [1,  testsamps+1]; 
t2  =  [testsamps-1,  H2]  ; 
others  =  []  ; 
for  i  =  1 :size(tl,2) , 

others  =  [others,  tl (i) :t2(i)] ; 
end  y,  for  i 

x2  =  X2(:,  testsamps); 

152  =  cov(X2( :, others) ’) ; 
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clear  testsamps  others  tl  t2 


nl  =  sizeCil ,  2) ; 
n2  =  size (x2 ,  2) ; 

fprintfCl,  ’  y,d  Class  1  samples,  Xd  Class  2  samples\n>,  nl,  n2) ; 

fprintfCl,  ’  Estimating  and  inverting  covariance  matrices  ...\n’); 
detratio  =  -0.5  ♦  log(det(iS2)/det(iSl)); 

151  =  inv(iSl); 

152  =  inv(iS2); 

fprintfCl,  ’  Computing  distances  ...\n’); 

[dll,  dl2,  d21,  d22]  =  compute.distancesCrl ,  x2,  iSl,  iS2) ; 

fprintfCl,  ’  Classifying  CParzen)  ...\n’); 

Rerr  =  []  ; 

Lerr  =  []  ; 

for  r  =  h, 

X  Compute  sums: 

temp  =  -0.5  /  r“2; 

sll  =  sumCexpCtemp  *  dll)); 

sl2  =  sumCexpCtemp  *  dl2)); 

s21  =  sumCexpCtemp  »  d21)); 

s22  =  sumCexpCtemp  ♦  d22)); 

Irl  =  detratio  -  logCCn2  ♦  sll)  ./  Cnl  ♦  s21)); 

lr2  =  detratio  -  logCCn2  *  sl2)  ./  Cnl  ♦  s22)); 

111  =  detratio  -  logCCn2  ♦  Csll  -  1))  ./  CCnl  -  1)  *  s21)); 

112  =  detratio  -  logCCCn2  -  1)  ♦  sl2)  ./  Cnl  ♦  Cs22  -  1))); 

[rerr,  lerr]  =  classifyClrl,  lr2.  111,  112,  opt); 

Rerr  =  [Rerr,  rerr]; 

Lerr  =  [Lerr,  lerr]; 

end  X  for  r 

RpCtest,:)  =  100  ♦  Rerr  /  Cnl+n2) ; 

LpCtest,:)  =  100  ♦  Lerr  /  Cnl+n2) ; 


fprintfCl,’  Classifying  Ck-HH)  ...\n’); 

X  sort  distances 
dll  =  sortCdll); 
dl2  =  sortCdl2); 
d21  =  sortCd21); 
d22  =  sortCd22); 

tr  =  detratio  +  logCnl/n2); 
til  =  detratio  +  logCCnl-l)/n2) ; 
tl2  =  detratio  +  logCnl/Cn2-l) ) ; 

Rerr  =  []  ; 

Lerr  =  []  ; 

for  i  =  k, 

Irl  =  tr  +  0.5  ♦  dim  ♦  logCdllCi,:)  ./  d2lCi,:)); 
lr2  =  tr  +  0.5  ♦  dim  ♦  logCdl2Ci,:)  ./  d22Ci,:)); 
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111  =  til  +  0.5  *  dim  *  log(dll(i+l,;)  ./  d21(i,:)) 

112  =  tl2  +  0.5  *  dim  ♦  log(dl2(i,:)  ./  d22(i+l,:)) 

[rerr,  lerr]  =  classify (Irl ,  lr2.  111,  112,  opt); 

Rerr  =  [Rerr,  rerr]; 

Lerr  =  [Lerr ,  lerr] ; 

end  '/.  for  i 

RkCtest,;)  =  100  ♦  Rerr  /  (nl+n2) ; 

Lk(test,;)  =  100  ♦  Lerr  /  (nl+n2) ; 

end  '/,  for  test 


C-4 


C.3  compute-distances. m 

’/,  COHPUTE.DISTAHCES :  Compute  K  distances  between  samples  in  XI  and  X2. 

y. 

•/.  [Dll,  D12,  D21.  D22]  =  C0MPUTE_DISTAHCES(X1 ,  X2,  invSl,  invS2) 

•/. 

'/,  Dll  =  class  1  distances  for  samples  of  class  1 

’/,  D12  =  class  2  distances  for  samples  of  class  1 

'/.  D21  =  class  1  distances  for  samples  of  class  2 

y,  D22  =  class  2  distances  for  samples  of  class  2 

y. 

y.  invSl  and  invS2  are  the  covariance  matrix  inverses  (maybe  estimated)  . 
y,  XI  and  X2  have  one  observation  per  column. 

function  [Dll,  D12,  D21 ,  D22]  =  compute_distetnces(Xl,  X2,  invSl,  invS2) 

y.  Humber  of  columns  is  the  number  of  samples 
HI  =  size(Xl,2); 

H2  =  size(X2 ,2) ; 

y.  Allocate  space 
Dll  =  zeros(Hl,Hl) ; 

D12  =  zerosCHl ,H2) ; 

D21  =  zeros(H2,Hl) ; 

D22  =  zeros(H2,H2) ; 

y.  Calculate  intra-class  disteinces; 

y.  Class  1; 
for  i  =  1 :H1-1 

for  j  =  i+1 ;H1 

Dll(i,j)  =  (Xl<;,j)  -  Xl(;,i))’  *  invSl  ♦  (Xl(:,j)  -  Xl(;,i)); 

Dll(j,i)  =  Dll(i,j); 
end 

end 

y.  Class  2 : 
for  i  =  1 :H2-1 

for  j  =  i+l:H2 

D22(i,j)  =  (X2(:,j)  -  X2(:,i))’  ♦  invS2  *  (X2(:,j)  -  X2(:,i)); 

D22(j,i)  =  D22(i,j) ; 
end 

end 

y.  Calculate  inter-class  distances: 

for  i  =  1:H1  %  Rows  for  D12,  Columns  for  D21 

for  j  =  1:H2  y.  Columns  for  D12,  Rows  for  D21 

%  Pull  samples  out  of  matrices  only  once 
V  =  X2(:,j)  -  Xl(;,i); 

D12(i,j)  =  v’  *  invSl  ♦  v; 

y.  These  could  be  (-v)  ,  but  there’s  no  reason  for  it.  (note  (j,i)) 
D21(j,i)  =  V’  ♦  invS2  ♦  v; 
end 

end 
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C.4  classify,  m 

%  CLASSIFY;  Select  thresholds  and  classify  resubstitution  and 
y,  leave-one-out  discriminant  values. 

y. 

y.  [a,  L]  =  CLASSIFYCLrl,  Lr2,  Lll,  L12,  option) 

y. 

y,  Inputs:  Lrl,  Lr2:  resubstitution  discriminant  values 
y,  Lll,  L12:  leave-one-out  discriminant  values 

y,  option;  1  =  threshold  option  3 

y,  2  =  threshold  option  4 

y. 

y.  Outputs:  R;  Resubstitution  error 
y,  L:  Leave-one-out  error 

function  [R.  L]  =  classify (Lrl ,  Lr2,  Lll,  L12,  option) 

y.  First  get  the  minimum  resubstitution  error  and  threshold 
fprintfCl,’  Resubstitution...’); 

[R,  t]  =  min.error (Lrl ,  Lr2) ; 
fprintf (1 , ’done .\n’) ; 

y.  How,  depending  on  the  option,  get  the  minimum  leave-one-out  error 
fprintfd,’  Leave  one  out...’); 
if  option  ==  1 

L  =  sum(Lll  >  t)  +  sum(L12  <  t); 

else 

nl  =  size (Lll,  2); 
n2  =  size(L12,  2) ; 

L  =  0; 

fprintfd ,  ’Class  1 .  .  .  ’)  ; 
for  i  =  1 :nl , 

[err,  t]  =  min_error(Lll([l:i-l  i+l:nl]),  L12); 
if  Lll(i)  >  t 
L  =  L  +  1; 

end  y.  if  Llld)  >  t 
end  y.  for  i 

fprintfd, ’Class  2...’); 
for  i  =  1 :n2 , 

[err,  t]  =  min.error (Lll ,  L12([l;i-1  i+l:ul])); 
if  L12(i)  <  t 
L  =  L  +  1; 

end  y.  if  L12(i)  <  t 
end  y,  for  i 
end  y,  if  option  ==  1 
fprintfd ,  ’done  .\n’)  ; 
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C.5  min^error.m 


"/,  HI1I_ERR0R:  Select  threshold  which  gives  the  mitiimum  error  for 
*/.  classifying  two  sets  of  discriminants. 

*/,  This  is  designed  to  be  used  for  any  number  of  elements 

'/,  in  each  set.  In  the  case  of  several  minimum  errors, 

'/,  the  threshold  that  yields  the  same  number  of 

'/,  misclassif ications  from  set  1  and  set  2  is  selected. 

•/. 

•/,  [ERROR,  THRESHOLD]  =  MIN.ERRORCSETl ,  SET2) 

y. 

y.  Inputs:  SETl,  SET2:  sets  of  discriminants  from  classes  14  2 

y. 

y.  Outputs:  ERROR:  Hinimum  error  obtainable 

y  THRESHOLD:  "Best?"  threshold  yielding  the  minimum  error 

function  [error,  threshold]  =  min.error (setl ,  set2) 

big  =  realmaz  -  realmin; 
search  =  0; 

%  Eliminate  infinities  from  the  search, 
infs  =  f ind(setl==inf ) ; 
setl(infs)  =  big  *  onesCsizeCinfs)) ; 
infs  =  f ind(setl==(-inf )) ; 
setl(infs)  =  -big  *  ones(size(infs)) ; 
infs  =  f ind(set2==inf ) ; 
set2(infs)  =  big  ♦  ones(size(infs)) ; 
infs  =  f ind(setl==(-lnf )) ; 
set2(infs)  =  -big  ♦  ones(size(infs)) ; 

a  =  min(setl) ; 
b  =  maz(setl) ; 
c  =  min(set2) ; 
d  =  maz(set2) ; 

if  b  <  c 

threshold  =0.S*b+0.5*c; 
error  =  0; 
return; 
end  y,  if 

if  (min([a  b  c  d])  ==  c)  4  (maz([a  b  c  d])  ==  b) 

'/.pick  c  or  b  at  random, 
choice  =  round(rand) ; 
if  choice  ==  0 

threshold  =  b  +  realmin; 

else 

threshold  =  c  -  realmin; 
end  y.  if  choice 

error  =  sumCsetl  >  threshold)  +  sum(set2  <  threshold) ; 
fprintfCl, ’Bad  decision  rule:  a=y.d,  b=y,d,  c=y.d,  d=y.d\n’ ,a,b,c,d) 
return; 
end  y,  if 

if(a<c4c<d4d<b)  I  (c<a4a<b4b<d) 
lo  =  a; 
hi  =  d; 

elseif  (a<c4c<b4b<d) 
lo  =  c; 
hi  =  b; 

else 

y.  Hhat’s  left???  Write  a  message  .write  out  a,  b,  c,  and  d 
fprintfd, ’Equal  condition:  a=y,d,  b=y.d,  c=y,d,  d=%d\n’ ,a,b,c,d)  ; 
threshold  =  .25*a+  .25*b+  .25*c+  .25  *d; 
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error  =  -1 ; 
return; 
end  '/,  if 

'/,  go  fish  (in  a  limited  pool) 

pool  =  [setl(setl>=lo  ft  setl<=hi)  set2(set2>=lo  ft  set2<=hi)]; 

'/,  Get  one  copy  of  each  distinct  value  in  temp 
temp(l)  =  lo; 
n  =  size(pool,2) ; 
for  i  =  2:n, 

temp(i)  =  min(pool(pool>temp(i-l))) ; 
if  temp(i)  ==  hi 
break 
end  '/,  if 
end  y,  for 
n  =  size(temp,2) ; 

errl  =  zeros (1.  n+1); 
err2  =  zerosCl,  n+1); 

y,  Count  the  errors  for  each  guess 
for  i=l;n, 

errl(i)  =  sumCsetl  >=  temp(i)); 
err2(i)  =  sum(set2  <  temp(i)); 
end  y.  for 

errl (n+1)  =  sum(setl  >  temp(n)); 
err2(n+l)  =  suffl(set2  <=  temp(n)); 

y.  Find  the  minimum  total  error 
total  =  errl  +  err2; 
error  =  min(total); 
index  =  find(total  ==  error); 
if  size(index,2)  ==  1 
tidx  =  index; 

else 

diff  =  abs(errl(index)  -  err2(index)) ; 

mindif  =  min(diff ) ; 

didx  =  find(diff  ==  mindif) ; 

nminds  =  size(didx,  2); 

if  nminds  ==  1 

tidx  =  index(didx) ; 

else 

choice  =  round( (nminds- 1)  ♦  rand)  +  1; 
tidx  =  index(didx(choice)) ; 
end  y,  if  nminds 
end  y,  if  size(index,2) 

if  tidx  >  1  ft  tidx  <=  n 

threshold  =  0.5  ♦  temp(tidx)  +  0.5  ♦  temp(tidx-l) ; 
elseif  tidx  ==  1 

lower  =  max([max(setl(setl<temp(l)))  max(set2(set2<temp(l) ))] ) ; 
if  size (loner ,2)  ==  0 

threshold  =  temp(l)  -  realmin; 

else 

threshold  =  0.5  *  temp(l)  +  0.5  *  loner; 
end  y,  if  size  (loner,  2) 

else 

higher  =  min( [min(setl (setl>ten5)(n)) )  min(set2(set2>temp(n)))])  ; 
if  size(higher ,2)  ==  0 

threshold  =  temp(n)  +  realmin; 

else 

threshold  =  0.5  ♦  temp(n)  +  0.5  *  higher; 
end  y,  if  size  (higher,  2) 
end  y,  if  t  idx .  .  . 
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Appendix  D.  Modified  Code  (ROC  Code) 

D.l  driver  script  file 

'/,  Driver  script  file  to  run  the  modified  code  and  create  ROC  curves 

load  targetset; 
load  backgroundset ; 

h=  1;  y.  the  best  h  S  k  from  the  original  code’s  output 

k=  20; 

[rhitp ,rfap,rhitk,rf ak.lhitp, If ap.lhitk, If ak]=pknn2 (targetset .backgroundset ,h,k) ; 

[faave.rhitpave ,rhitkave,lhitpave,lhitkave,rpu,rpl,rku,rkl,lpu,lpl,lku,lkl]=roccurve(rhitp,rfap,rhitk,rfak, 
Ihitp , If ap , Ihitk , If ak) ; 
r=f aave ; 

figured) 

plot(Pfa,Phit, ’y*’) 
hold  on 

plot (r .rhitpave , ’r — ’ .x.lhitpave , ’b- ’ ,x,rpu, ’r : ’ ,x,rpl, ’r : ’ ,r ,lpu, ’b- . ’ ,x,lpl, ’b- . ’) 
axis([0  101]) 

xlabeK ’Probability  of  False  Alarm’); 
ylabeK ’Probability  of  Hit’); 
titleC ’Average  ROC  Curve  (Parzen)’); 

legendC ’r — ’ , ’Resubstitution’ , ’r : ’ , ’R  bounds ’ , ’b-’ , ’Leave-one-out ’ , ’b- . ’ , ’L  bounds ’ , ’y* ’ , ’Actual’) ; 
hold  off 

figure (2) 

plot(Pfa,Phit, ’y*’) 
hold  on 

plot (x.rhitkave, ’r — ’ .x.lhitkave, ’b-’ ,x,rku, ’r : ’ ,x,rkl, ’r : ’ ,x,lku, ’b-. ’ ,x,lkl, ’b- . ’) 
axls([0  101]) 

XlabeK ’Probability  of  False  Alarm’); 
ylabeK ’Probability  of  Hit’); 
titleC ’Average  ROC  Curve  (k-HH)’); 

legendC ’r — ’ , ’Resubstitution’ , ’r : ’ , ’R  bounds ’ , ’b-’ , ’Leave-one-out ’ , ’b- . ’ , ’L  bounds ’ , ’y* ’ , ’Actual’) 
hold  off 

'/,  area  under  leave-one-out  ROC 
ak=sum(lhitkave)/length(lhitkave) ; 
ap=sum(lhitpave) /length (Ihitpave) ; 
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D.2  pknn2.m 


'/,  PKII1I2 :  Run  Parzen  and  MH  procedure  for  XI,  X2  n  times. 

•/. 

'/,  Inputs:  XI,  X2:  data  sets  (n  z  HI  <ind  n  x  N2) 

'/,  h,  k:  values  to  use  for  h  and  k 

•/. 


•/. 


'/,  All  h’s  must  be 
'/,  minCHl,  H2)-l 

•/. 

'/,  Outputs:  rhitp: 
'/.  rfap  : 

'/,  rhitk : 

*/.  rfak  : 

%  Ihitp : 

*/.  Ifap  : 

'/,  Ihitk : 

'/,  Ifak  : 

•/. 


greater  than  zero,  and  k  must  be  between  2  and 


Probability  of 
Probability  of 
Probability  of 
Probability  of 
Probability  of 
Probability  of 
Probability  of 
Probability  of 


hit  for  Parzen  resubstitution 

false  alarm  for  Parzen  resubstitution 

hit  for  k-HH  resubstitution 

false  alarm  for  k-HH  resubstitution 

hit  for  Parzen  leave-one-out 

false  alarm  for  Parzen  leave-one-out 

hit  for  k-HH  leave-one-out 

false  alarm  for  k-HH  leave-one-out 


function[rhitp, rfap, rhitk, rfak, Ihitp, Ifap, Ihitk, lfak]=  pknn2(Xl ,  X2,  h,  k) 

[nl,  HI]  =  size(Xl); 

[n2,  H2]  =  size(X2); 
if  nl  "=  n2 

fprintf(2,  ’Data  sets  XI  and  X2  must  have  same  number  of  rows 
(features)\n’) ; 

return; 
end  y,  if 

*/,  Keep  this  value  on  hand 
dim  =  nl; 
ntests  =  10; 

fprintfCl,  ’Performing  '/.d  independent  tests:  \n’,  ntests); 

rhitp=  []  ; 
rfap=[]  ; 
rhitk=  []  ; 

rfak=n  ; 

lhitp=  []  ; 
lfap=[]  ; 

lhitk=n  ; 

lfak=[]  ; 
counter=0; 

for  test  =  l:ntests, 

fprintfCl,’  Test  #y,d:\n’,  test); 

testsamps  =  test :ntests :H1 ; 
tl  =  [1,  testsamps+1] ; 
t2  =  [testsamps-1,  HI]; 
others  =  []  ; 
for  i  =  1 :size(tl,2) , 

others  =  [others,  tl(i) :t2(i)] ; 
end  '/,  for  i 

xl  =  XlC:,  testsamps); 
iSl  =  covCXlC : ,others) ’) ; 

testsamps  =  test :ntests:H2; 
tl  =  [1,  testsamps+1]; 
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t2  =  [testsamps-l ,  II2]  ; 

others  =  C]  ; 

for  i  =  1 :size(tl,2) , 

others  =  [others,  tl(i) :t2(i)] ; 
end  '/,  for  i 

x2  =  X2(:,  testsamps) ; 
iS2  =  cov(X2( : .others) 0 : 

clear  testsamps  others  tl  t2 

nl  =  sizeCrl ,  2) ; 
n2  =  size(r2,  2) ; 

fprintfCl,  ’  V,d  Class  1  samples,  '/,d  Class  2  samples\n’ ,  nl,  n2) ; 

fprintfCl,  ’  Estimating  and  inverting  covari«ince  matrices  .  .  An’); 
detratio  =  -0.5  *  log(det (iS2)/det (iSD) ; 

151  =  inv(iSl); 

152  =  inv(iS2); 

fprintfCl,  '  Computing  distances  ...\n’); 

[dll,  dl2,  d21,  d22]  =  compute.distancesCxl,  i2,  iSl,  iS2) ; 

fprintfCl,  ’  Classifying  CParzen)  ...\n’); 

for  r  =  h, 

count er=counter+l ; 

y.  Compute  sums: 

temp  =  -0.5  /  r“2; 

sll  =  sumCexpCtemp  ♦  dll)); 

sl2  =  sumCexpCtemp  *  dl2)); 

s21  =  sumCexpCtemp  ♦  d21)); 

s22  =  sumCexpCtemp  ♦  d22)); 

Irl  =  detratio  -  logCCn2  ♦  sll)  ./  Cnl  »  s21)); 

lr2  =  detratio  -  logCCn2  ♦  sl2)  ./  Cnl  *  s22)); 

111  =  detratio  -  logCCn2  *  Csll  -  1))  ./  CCnl  -  1)  *  s2l)); 

112  =  detratio  -  logCCCn2  -  1)  *  sl2)  ./  Cnl  *  Cs22  -  1))); 

[Rhit ,Rfa,Lhit ,Lfa]  =  classify2 Clrl ,  lr2.  111,  112); 

lRhit=lengthCRhit) ; 

IRf a=lengthCRfa) ; 
rhitpCtest ,1 :lRhit)=Rhit ; 
rfapCtest ,1 :lRfa)=Rfa; 

lLhit=lengthCLhit) ; 

ILf a=lengthCLf a) ; 

IhitpCtest ,1 :lLhit)=Lhit ; 

IfapCtest ,1 :lLfa)=Lfa; 

end  y,  for  r 

fprintfCl,’  Classifying  Ck-HH)  ...\n’); 

y,  sort  distances 
dll  =  sortCdll); 
dl2  =  sortCdl2); 
d21  =  sortCd21); 
d22  =  sortCd22); 

tr  =  detratio  +  logCnl/n2); 
til  =  detratio  +  logCCnl-l)/n2) ; 
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tl2  =  detratio  +  log(iil/(ii2-l))  ; 


for  i  =  k, 

Irl  =  tr  +  0.5  *  dim  ♦  logCdlKi,:)  ./  d21(i,;)); 
lr2  =  tr  +  0.5  *  dim  ♦  log(dl2(i,;)  ./  d22(i,:)); 

111  =  til  +  0.5  *  dim  ♦  log(dll(i+l,:)  ./  d21(i,:)); 

112  =  tl2  +  0.5  ♦  dim  ♦  log(dl2(i,:)  ./  d22(i+l,:)); 

[Rhit ,Rfa,Lhit ,Lfa]  =  classify2(lrl,  lr2.  Ill,  112); 

lRhit=length(Rhit) ; 

IRf a=length(Rf a) ; 
rhitkCtest ,1 :lRhit)=Rhit ; 
rfakCtest ,1 :lRfa)=Rfa; 

lLhit=length(Lhit) ; 

ILf a=length(Lf a) ; 

IhitkCtest ,1 :lLhit)=Lhit ; 

IfakCtest ,1 ;lLfa)=Lfa; 

end  */,  for  i 

end  X  for  test 
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D.  3  classify  2.  m 

%  CLASSIFY2:  classify  resubstitution  and  leave-one-ont 
*/.  discriminant  values . 

•/. 

*/,  Inputs:  Lrl,  Lr2:  resubstitution  discriminant  values 
’/,  Lll ,  L12 :  leave-one-out  discriminant  values 

•/. 

*/,  Outputs:  Rhit.Rfa:  Resubstitution  probabilities 
y,  Lhit.Lfa:  Leave-one-out  probabilities 

•/. 


function  [Rhit ,Rf a,Lhit ,Lf a]  =  classify2(Lrl,  Lr2,  Lll,  L12) 

*/,  First  get  the  minimum  resubstitution  error 
fprintfCl,’  Resubstitution...’); 

[Rhit.Rfa]  =  min_error2(Lrl,  Lr2) ; 
fprintf (1 , ’done .\n’) ; 

’/,  Find  the  leave-one-out  error 
fprintfCl,’  Leave  one  out...’); 

[Lhit ,Lfa]=  min_error2(Lll ,L12) ; 
fprintfCl , ’done .\n’) ; 
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D.4  min-error2.m 

%  1IIH_ERR0R2:  Given  two  sets  of  discriminant  values,  finds  the  probabilities 
'/,  of  hit  and  false  alarm 

•/. 

'/,  Inputs:  SETl,  SET2;  sets  of  discriminants  from  classes  1  &  2 

■/. 

'/,  Outputs:  Phit :  Probability  of  a  hit 
%  Pfa  :  Probability  of  false  alarm 

% 


function  [Phit,  Pfa]  =  min_error2(setl ,  set2) 

i=f  ind(setl"=llall)  ; 
temp=setl (i) ; 
setl=temp; 

j=f  ind(set2"=HalI)  ; 
temp2=set2( j) ; 
set2=temp2; 

clear  i  j  temp  temp2 

big  =  realmax  -  realmin; 
search  =  0; 

y.  Eliminate  infinities  from  the  search, 
infs  =  f ind(setl==inf) ; 
setl(infs)  =  big  *  ones(size(infs)) ; 
infs  =  f ind(setl==(-inf )) ; 
setl(infs)  =  -big  *  ones(size(infs)) ; 
infs  =  f ind(set2==inf) ; 
set2(infs)  =  big  *  ones(size(infs)) ; 
infs  =  find(setl==(-inf)) ; 
set2(infs)  =  -big  ♦  ones(size(infs)) ; 

a  =  min(setl) ; 
b  =  mar(setl) ; 
c  =  min(set2) ; 
d  =  max(set2) ; 


y,  determines  hos  the  discriminant  values  of  the  two  sets  overlap 
if  ((b  <  c)  I  (d  <  a)) 

Phit=l ; 

Pfa=0; 
return; 
end  y.  if 


if  (a  <=  c  ft  c  <=  d  ft  d  <=  b) 
lo  =  c; 
hi  =  d; 

elseif  (c<=afta<=bftb<=d) 
lo  =  a; 
hi  =  b; 

elseif  (a  <=  c  ft  c  <=  b  ft  b  <=  d) 
lo  =  c; 


hi  =  b; 

elseif  (c  <=  a  ft  a  <=  d  ft  d  <=  b) 
lo  =  c; 
hi  =  b; 

else 


y.  What’s  left???  bad  decision  rules, 
fprintf (1, ’Bad  decision’); 
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Phit=.5; 

Pf a= . 5 ; 
return; 
end  y,  if 

'/.  go  fish  (in  a  limited  pool) 

pool  =  [setl(setl>=lo  ft  setl<=hi)  set2(set2>=lo  ft  set2<=hi)] 

'/,  Get  one  copy  of  each  distinct  value  in  temp 
temp(l)  =  lo; 
n  =  size(pool,2) ; 
for  i  =  2:n, 

temp(i)  =  min(pool(pool>temp(i-l))) ; 
if  temp(i)  ==  hi 
break 
end  y,  if 
end  y,  for 
n  =  size(temp,2) ; 

errl  =  zerosCl,  n+1); 
err2  =  zeros (1.  n+1); 

y.  Count  the  errors  for  each  guess 
for  i=l;n, 

errl(i)  =  sumCsetl  >=  temp(i)); 
err2(i)  =  sum(set2  <  temp(i)); 
end  y,  for 

errl (n+1)  =  sum(setl  >  temp(n)); 
err2(n+l)  =  sum(set2  <=  temp(n)); 

y.  Find  the  probabilities 
Pmiss=errl ./length(setl) ; 

Phit=l  -  Pmiss; 

Pf a=err2 . /Iength(set2) ; 
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D.5  roccurve.m 


X  ROCCURVE  :  Creates  the  ROC  curves  from  the  hit  and  false  alarm  probabilities. 

•/. 

*/,  Inputs:  rhitp,rhit)c,lhitp,Ihitk  :  hit  probability  matrices 

'/,  rfap,rfak,lfap,lfak  ;  corresponding  false  alarm  matrices 

■/. 

'/.  Outputs:  faave  :  average  false  alarm  probabilities  vector 
’/.  rhitpave,rhitkave,Ihitpave,lhitkave  :  average  hit  probabilities 

*/.  rpu , rku , Ipu , Iku  :  upper  bound  on  95%  confidence  interval 

y,  rpl , rkl , Ipl , Ikl  :  lower  bound  on  95%  confidence  interval 

% 


fnnction[faave,rhitpave,rhitkave,lhitpave,lhitkave,]:pu.rpl,rku,rkl,lpu,lpl,lku,lkl]=roccurve(rhitp,rfap, 
rhitk , rf ak, Ihitp , If ap , Ihitk, If ak) 

[r ,c]=size(rhitp) ; 

x=0: .01:1; 
faave=i; 

rhitpl=interpolate(rfap(l , : ) ,rhitp(l , : ) ,x) ; 
rhitp2=interpolate(rfap(2. :) ,rhitp(2, :) ,x) ; 
rhitp3=interpolate(rfap(3, :) ,rhitp(3, :) ,x) ; 
rhitp4=interpolate(rfap(4, :) .rhitp(4, :) ,x) ; 
rhitp5=interpolate(rfap(5, :) ,rhitp(5, :) ,x) ; 
rhitp6=interpolate(rfap(6, :) ,rhitp(6, :) ,x) ; 
rhitp7=interpolate(rfap(7, :) ,rhitp(7, :) ,x) ; 
rhitp8=interpolate(rfap(8, :) ,rhitp(8, :) ,x) ; 
rhitp9=interpolate(rfap(9, : ) ,rhitp(9, :) ,x) ; 
rhitpl0=interpolate(rfap(10, : ) ,rhitp(10, :) ,x) ; 

rhitkl=interpolate(rfak(l, :) ,rhitk(l, :) ,x) ; 
rhitk2=interpolate(rfak(2, :) ,rhitk(2, :) ,x) ; 
rhitk3=interpolate(rfak(3, :) ,rhitk(3, :) ,x) ; 
rhitk4=interpolate(rfak(4, :) ,rhitk(4, :) ,x) ; 
rhitk5=interpolate(rfak(5, :) ,rhitk(S, :) ,x) ; 
rhitk6=interpolate(rfak(6, :) ,rhitk(6, :) ,x) ; 
rhitk7=interpolate(rfak(7, :) ,rhitk<7, :) ,x) ; 
rhitk8=interpolate(rfak(8, :) ,rhitk(8, :) ,x) ; 
rhitk9=interpolate(rfak(9, :) ,rhitk(9, :) ,x) ; 
rhitklO=interpolate(rf akClO, : ) ,rhitk(10, : ) ,x) ; 

lhitpl=interpolate(lfap(l , : ) ,  Ihitp (1 , ,x) ; 
lhitp2=interpolate(lfap(2, :) ,  Ihitp (2, : ) ,x) ; 
lhitp3=interpolate(lfap(3, :) ,lhitp(3, :) ,x) ; 
lhitp4=interpolate(lfap(4, :) .Ihitp (4, :) ,x) ; 
lhitp5=interpolate(lfap(5, :) ,lhitp(5, :) ,x) ; 
lhitp6=interpolate(lfap(6, :) ,lhitp(6, :) ,x) ; 
lhitp7=interpolate(lfap(7, :) , Ihitp (7, :) ,x) ; 
lhitp8=interpolate(lfap(8, :) ,lhitp(8, :) ,x) ; 
lhitp9=interpolate(lfap(9, :) ,lhitp(9, :) ,x) ; 
lhitplO=interpolate(lf apClO, : ) , Ihitp (10, :) ,x) ; 

lhitkl=interpolate(lfak(l , :) ,lhitk(l , :) ,x) ; 
lhitk2=interpolate(lfak(2, :) ,lhitk(2, :) ,x) ; 
lhitk3=interpolate(lfak(3, :) ,lhitk(3, :) ,x) ; 
lhitk4=interpolate(lfak(4, :) ,lhitk(4, :) ,x) ; 
lhitk5=interpolate(lfak(5 , :) ,lhitk(5, :) ,x) ; 
lhitk6=interpolate(lfak(6, :) ,lhitk(6, :) ,x) ; 
lhitk7=interpolate(lfak(7, :) ,Ihitk(7, :) ,x) ; 
lhitk8=interpolate(lfak(8, :) ,lhitk(8, :) ,x) ; 
lhitk9=interpolate(lfak(9, :)  .IhitkO, :)  ,x)  ; 
lhitkl0=interpolate(lfak(10, :) ,lhitk(10, :) ,x) ; 
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*/,  averages 

rhitpave=(rhitpl+rhitp2+rhitp3+rhitp4+rliitp5+rliitp6+rhitp7+rhitp8+rhitp9+rhitpl0)  /lO; 
rhitkave=  Crhitkl+rliitk2+rhitk3+rhitk4+rhitk5+rhitk6+rhitk7+rhitk8+rhitk9+rhitkl0)  /lO ; 
Ihitpave=(lhitpl+lhitp2+lhitp3+lhitp4+lhitp5+lhitp6+lhitp7+lhitp8+lhitp9+lhitpl0) /lO; 
lhitkave= (Ihitkl+lhitk2+lhitk3+lhitk4+lhitk5+lhitk6+lhitk7+lhitk8+lhitk9+lhitkl0) / 10 ; 

y, standard  deviations 

rp= [rhitpl ; rhitp2 ; rhitp3 ; rhitp4 ; rhitpS ; rhitpS ; rhitp7 ; rhitpS ;rhitp9 ; rhitplO] ; 
rk= [rhitkl ; rhitk2 ; rhitk3 ; rhitk4 ; rhitkS ; rhitk6 ; rhitk7 ; rhitkS ;rhitk9; rhitklO] ; 
lp= [Ihitpl ; lhitp2 ; lhitp3 ; lhitp4 ; IhitpS ; lhitp6 ; lhitp7 ; IhitpS ; Ihit p9 ; IhitplO] ; 
lk= [Ihitkl ; lhitk2 ; lhitk3 ; lhitk4 ; IhitkS ; lhitk6 ; lhitk7 ; IhitkS ; lhitk9 ; Ihitkl 0] ; 

srp=std(rp) ; 
srk=std(rk) ; 
slp=std(lp) ; 
slk=std(lk) ; 


'/,  confidence  intervals 
rpu=rhitpave+2 . 262* (srp . / sqrt (10) ) ; 
rpl=rhitpave-2 . 262* (srp ./sqrt (10)) ; 

rku=rhitkave+2 . 262* ( srk . /sqrt (10) ) ; 
rkl=rhitkave-2 . 262* (srk ./sqrt (10)) ; 

lpu=lhitpave+2 . 262* (sip ./sqrt (10)) ; 
lpl=lhitpave-2 .262* (sip ./sqrt (10)) ; 

lku=lliitkave+2 . 262*(slk .  /sqrt  (10) )  ; 
lkl=lhitkave-2 . 262*(slk./sqrt (10)) ; 
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D.6  interpolate. m 

'/,  IHTERPOLATE ;  Interpolates  points  on  the  given  curves 

•/. 

'/,  Inputs;  fa  :  false  alarm  probabilities 

*/,  hit  ;  corresponding  hit  probabilities 

•/. 

'/,  Output;  neHhit2  ;  interpolated  hit  vector,  same  length  as  neuvector 

y. 


function[neHhit2]=interpolate(fa,hit ,neHvector) 

f a= (round(1000*f a) ) / 1000 ; 
s=mar(fa) ; 

neHfa=[0]  ; 
neHhit=[0]  ; 


y=f ind(f a==0) ;  %  gets  hit  value  for  f a=0 

if  isempty(y)==0 
z=f indChitCy)) ; 
if  isempty (z)==0 

yy=(sum(hit(z)))/length(z) ; 
neHhit=[yy]  ; 
end 

end 

for  i=.001 ; .001 ;s 
y=f ind(fa==i) ; 
if  isempty(y)==0 
neHfa=[neBf  a  i]  ; 
yy=(sum(hit(y)))/length(y) ; 
neHhit=[neHhit  yy] ; 
end 

end 

if  s"=l 

neHfa=[neirfa  1];  %  final  value  for  fa=l 

newhit=[neHhit  1]; 

end 

neHhit2=interpl (newf a,newhit ,newvector) ;  %  does  interpolation 


y,  creates  monotonic  points 
y.  to  put  into  interpl 
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